There are many important future challenges in Big Data management and analytics that arise from the nature of the data: it is large, diverse, and evolving. These are some of the challenges that researchers and practitioners will have to deal with in the coming years:
Analytics Architecture. It is not yet clear what an optimal architecture for an analytics system should look like to deal with historic data and with real-time data at the same time. An interesting proposal is the Lambda Architecture of Nathan Marz. The Lambda Architecture solves the problem of computing arbitrary functions on arbitrary data in real time by decomposing the problem into three layers: the batch layer, the serving layer, and the speed layer. It combines in the same system Hadoop for the batch layer and Storm for the speed layer. The properties of the system are: robust and fault tolerant, scalable, general, extensible, allows ad hoc queries, minimal maintenance, and debuggable.
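The following is a minimal sketch, in Python, of the query-time merge that the Lambda Architecture relies on: a batch view precomputed over the complete master dataset (the role Hadoop plays) is combined with a real-time view maintained incrementally (the role Storm plays). The class and function names are illustrative assumptions, not part of any specific framework.

```python
# Illustrative sketch of the Lambda Architecture's layers; all names are assumptions.
from collections import defaultdict

class BatchLayer:
    """Recomputes a view over the complete, immutable master dataset (high latency)."""
    def __init__(self, master_dataset):
        self.master_dataset = list(master_dataset)

    def recompute_view(self):
        view = defaultdict(int)
        for event in self.master_dataset:        # full recomputation over all history
            view[event["key"]] += event["value"]
        return dict(view)

class SpeedLayer:
    """Incrementally folds in events that arrived after the last batch run (low latency)."""
    def __init__(self):
        self.realtime_view = defaultdict(int)

    def process(self, event):
        self.realtime_view[event["key"]] += event["value"]

def query(batch_view, realtime_view, key):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)

# Usage: historic events live in the batch view, recent ones in the speed layer.
batch = BatchLayer([{"key": "clicks", "value": 100}])
speed = SpeedLayer()
speed.process({"key": "clicks", "value": 3})
print(query(batch.recompute_view(), speed.realtime_view, "clicks"))  # 103
```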
Statistical significance. It is important to achieve significant statistical results, and not be fooled by randomness. As Efron explains in his book about Large Scale Inference, it is easy to go wrong with huge data sets and thousands of questions to answer at once.
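A small simulation can make the pitfall concrete. The sketch below, an assumption-laden illustration rather than anything from Efron's book, runs thousands of tests on pure noise: at the usual 5% level hundreds of "discoveries" appear by chance alone, while a Bonferroni-corrected threshold removes essentially all of them.

```python
# Multiple-testing illustration on pure noise: many naive "significant" results appear.
import random
import statistics
import math

random.seed(0)
n_tests, n_samples, alpha = 10_000, 50, 0.05

def p_value_two_sided(sample):
    """Approximate two-sided p-value for H0: mean = 0, via a z-test."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = abs(mean / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

p_values = [p_value_two_sided([random.gauss(0, 1) for _ in range(n_samples)])
            for _ in range(n_tests)]

naive = sum(p < alpha for p in p_values)                  # roughly 5% false discoveries
bonferroni = sum(p < alpha / n_tests for p in p_values)   # typically zero
print(f"naive rejections: {naive}, Bonferroni rejections: {bonferroni}")
```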
Distributed mining. Many data mining techniques are not trivial to parallelize. To obtain distributed versions of some methods, a lot of research is still needed, with both practical and theoretical analysis, to provide new methods.
Time evolving data. Data may be evolving over time, so it is important that the Big Data mining techniques are able to adapt and, in some cases, to detect change. For example, data stream mining offers very powerful techniques for this task.
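As a rough illustration of change detection on an evolving stream, the sketch below compares the mean of a reference window with the mean of the most recent window and resets when they diverge. This two-window test is an assumption for the sake of the example; real stream mining detectors such as ADWIN or DDM are more principled.

```python
# Simplified two-window drift detector; illustrative only, not ADWIN/DDM.
from collections import deque
import random

class SimpleDriftDetector:
    def __init__(self, window=200, threshold=0.5):
        self.reference = deque(maxlen=window)   # summary of "old" behaviour
        self.recent = deque(maxlen=window)      # most recent observations
        self.threshold = threshold

    def add(self, x):
        if len(self.reference) < self.reference.maxlen:
            self.reference.append(x)            # still collecting the reference window
            return False
        self.recent.append(x)
        if len(self.recent) < self.recent.maxlen:
            return False
        drift = abs(sum(self.recent) / len(self.recent)
                    - sum(self.reference) / len(self.reference)) > self.threshold
        if drift:                               # adapt: the recent data becomes the new reference
            self.reference = deque(self.recent, maxlen=self.reference.maxlen)
            self.recent.clear()
        return drift

# Usage: the stream's mean shifts from 0.0 to 1.0 halfway through.
random.seed(1)
detector = SimpleDriftDetector()
for i in range(2000):
    x = random.gauss(0.0 if i < 1000 else 1.0, 0.3)
    if detector.add(x):
        print(f"change detected around item {i}")
```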
Compression. Dealing with Big Data, the quantity of space needed to store it is very relevant. There are two main approaches: compression, where we do not lose anything, and sampling, where we choose the data that is most representative. Using compression, we may take more time and less space, so we can consider it as a transformation from time to space. Using sampling, we are losing information, but the gains in space may be of orders of magnitude. For example, Feldman et al. use coresets to reduce the complexity of Big Data problems. Coresets are small sets that provably approximate the original data for a given problem. Using merge-reduce, the small sets can then be used for solving hard machine learning problems in parallel.
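To illustrate the sampling side of this trade-off, the sketch below uses classic reservoir sampling to keep a fixed-size uniform sample of a stream of unknown length. This is not the coreset construction of Feldman et al., only an assumed stand-in that shows how information is traded for orders of magnitude less space.

```python
# Reservoir sampling: keep k items uniformly at random from a stream of unknown length.
import random

def reservoir_sample(stream, k, seed=0):
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)               # fill the reservoir first
        else:
            j = rng.randint(0, i)                # item i survives with probability k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

# Usage: summarize ten million values with a thousand-item sample.
sample = reservoir_sample(range(10_000_000), k=1_000)
estimate = sum(sample) / len(sample)             # approximates the true mean
print(f"estimated mean: {estimate:,.0f} (true mean: 4,999,999.5)")
```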
Visualization. A main task of Big Data analysis is how to visualize the results. As the data is so big, it is very difficult to find user-friendly visualizations. New techniques and frameworks to tell and show stories will be needed, as for example the photographs, infographics and essays in the beautiful book "The Human Face of Big Data".
Hidden Big Data. Large quantities of useful data are getting lost since new data is largely untagged, file-based, and unstructured. The 2012 IDC study on Big Data explains that in 2012, 23% (643 exabytes) of the digital universe would be useful for Big Data if tagged and analyzed. However, currently only 3% of the potentially useful data is tagged, and even less is analyzed.