Thank you very much for your reply; you have given us much constructive thinking. So that means bioinformatics development should keep pace with the development of sequencing technology and machines; that is the trend. I really believe NGS has a bright future, and it will be important and indispensable to biology research.
That was a very coherent answer from Rohan. In fact, the issues are not endemic to human/clinical studies - in plant science we have much the same problems. Just to add that, in addition to software and tools for data interpretation, which are clearly a huge bottleneck, another key issue is human resources. Training new generations to be both bioinformaticians and good bioinformatics users (scientists who are not computational biologists but know enough about the process to recognize the limits and pitfalls of data analysis) will be key.
Also, just to give an example of how different approaches applied to very similar sets of data (the RNA "structurome" in human, yeast and plants) can explore and reveal different things - take a look at the recently published papers on RNA structure in Nature:
This is just my perspective, but for much of NGS analysis, what is sorely lacking is established best practices and robust statistics. For RNA-Seq alone, there must be at least a dozen competing methods of data normalization, and at least as many methods of statistical testing. Often, applying these to the same data set will give such different results that one's entire biological interpretation may be altered from one method to another. It seems to me that one of the often under-appreciated aspects of NGS data is its dynamic range, and the inherent problems that presents for issues like normalization across the actual range of count data obtained in studies.
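To make that concrete, here is a minimal sketch in Python (toy counts I invented; not any specific package's actual implementation) of how two common normalizations - plain total-count scaling (CPM) and a DESeq-style median-of-ratios - can disagree on the very same counts when a single gene dominates the library:

```python
# Minimal sketch: CPM vs. median-of-ratios normalization on invented counts.
# Gene names, counts, and the 2-vs-2 design are all hypothetical.
import numpy as np

# rows = genes, columns = samples (2 control, 2 treated)
counts = np.array([
    [500,  520,  480,  510],   # geneA: unchanged
    [300,  310,  290,  305],   # geneB: unchanged
    [100,  110,  105,   95],   # geneC: unchanged
    [200,  210, 4000, 4200],   # geneD: strongly induced, dominates library size
], dtype=float)

# --- total-count normalization (counts per million) ---
cpm = counts / counts.sum(axis=0) * 1e6

# --- DESeq-style median-of-ratios size factors ---
log_geomeans = np.log(counts).mean(axis=1)        # per-gene geometric mean (log scale)
ratios = np.log(counts) - log_geomeans[:, None]   # log ratio of each sample to that mean
size_factors = np.exp(np.median(ratios, axis=0))  # per-sample size factor
mor = counts / size_factors

def log2fc(mat):
    """log2 fold change, treated (cols 2-3) vs. control (cols 0-1)."""
    return np.log2(mat[:, 2:].mean(axis=1) / mat[:, :2].mean(axis=1))

print("log2FC under CPM:             ", np.round(log2fc(cpm), 2))
print("log2FC under median-of-ratios:", np.round(log2fc(mor), 2))
# Under CPM the unchanged genes (A-C) appear strongly down-regulated, because
# geneD inflates the treated libraries' totals; median-of-ratios corrects this.
```

Real methods (TMM, DESeq2's size factors, quantile normalization, and so on) differ in exactly these kinds of choices, which is how the same count matrix can yield quite different gene lists.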
Micro-arrays also have multiple methods for normalization and analysis, but the differences between them, in terms of final results and interpretation, tend to be relatively minor for a given data set. Micro-array analysis did take some time, and a great deal of effort, to arrive at robust guidelines and best practices, so I'm hopeful that NGS will get there in time as well. It is not there yet, in my opinion.
The other issue I see all the time, again primarily in RNA-Seq studies, is a complete disregard for proper sampling. Nothing about NGS data obviates the need for proper biological replication in comparative population studies. I know people reduce sample sizes when deciding to go with NGS because of the cost and/or the time and effort required to process samples. But without proper population sampling, an NGS study is just as much bad science as any other population genetics study with an inadequate sampling design.
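Just to illustrate the point with a rough simulation (the mean and dispersion are invented, typical-ish values, not from any real data set): even when two groups are drawn from the identical distribution, the apparent fold changes estimable from only two biological replicates per group dwarf those at six.

```python
# Rough sketch: spread of log2 fold-change estimates between two IDENTICAL
# groups, as a function of replicate number. Parameters are assumed values.
import numpy as np

rng = np.random.default_rng(0)
mean, dispersion = 100.0, 0.3            # hypothetical gene: Var = mean + disp*mean^2
# numpy's negative_binomial takes (n, p); convert from mean/dispersion:
n_param = 1.0 / dispersion
p_param = n_param / (n_param + mean)

def fc_spread(reps, trials=20000):
    """Std dev of log2 fold-change estimates between two identical groups."""
    a = rng.negative_binomial(n_param, p_param, size=(trials, reps)).mean(axis=1)
    b = rng.negative_binomial(n_param, p_param, size=(trials, reps)).mean(axis=1)
    return np.log2((a + 0.5) / (b + 0.5)).std()   # +0.5 guards against zeros

for reps in (2, 3, 6, 10):
    print(f"n = {reps:2d} per group: SD of log2FC ~ {fc_spread(reps):.2f}")
# The spread at n=2 is roughly sqrt(3) times that at n=6: "fold changes" of
# that magnitude are pure sampling noise, not biology.
```

Any real power calculation would also need per-gene dispersion estimates and multiple-testing correction, but the qualitative message survives: replicates are not optional.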
In that regard, NGS still needs to develop in order to allow for very large sample studies. For now, in many companies and institutions, NGS is still more cumbersome than micro-arrays for that sort of very large-scale population study, such as chemical or drug risk assessment analyses. This is especially so for companies that are obligated to do their work entirely in-house, as they have to cover not only instrument and consumable costs but also personnel salaries and overhead. In such situations, micro-arrays are still superior in terms of sheer sample throughput.
Again, this is just my perspective given my own area of interest and focus. Most toxicogenomics is still done with micro-arrays these days, both because of the need for very large sample sizes (so very high throughput) and because of the not insignificant issues of data normalization and statistical analysis. The latter also stands in the way of any studies involving regulated compounds, as government regulatory bodies are always loath to accept data that abandons established practices for ones whose methods are still much in debate.