You have to think top-down. First, find an application domain of interest, then define a Big Data problem that the chosen domain suffers from. Finally, look for nontrivial solutions that save processing resources and achieve better performance than those already in the literature.
Minimal data sets, i.e. learning more from less data in a reliable fashion. One school of thought sees the challenge in big data analytics as not having enough data. Reliable 'minimal data' analytical techniques would help in that case.