Depends... I don't really agree that prices are the same, if you have both devices (a good scanner and a good sequencer) in-house. Real in-depth transcriptome seq is still expensive, whereas Affy or Agilent chips are really cheap nowadays. Moreover, we've had fifteen years to make sense of microarray normalization and interpretation, and a wealth of good tools exists to analyze them, while the same cannot be said yet for mRNA-seq. It really depends on whether you have money, good bioinfo people, and whether you trust whoever runs your experiments. Most importantly, it depends on your biological hypothesis: if you want to assess rare transcripts, splice variants or fusion transcripts, there is no comparison. If you just want to do a gene expression analysis, GSEA or classification/prediction analysis, I would go for microarrays. Even in this case, if you have enough cDNA I would store an aliquot for further comparisons between the two methods!
I would choose next generation sequencing for my next expression profiling study. The cost is dropping fast, and its unbiased nature is indeed preferable to that of microarrays. In addition, the potential to study other mechanisms such as alternative splicing or RNA editing opens up a whole new array of possibilities.
This clearly depends on the aims of the study. Many people do profiling to get an idea which known genes and pathways might be involved in a biological response (to a treatment, to a disease, to cell development...). Microarrays are perfectly fine here. Others try to find candidates or RNA species outside the known spectrum of genes; they go for NGS. I think the number of genes is not the point. Microarrays can cover the vast majority of sequences, and they are at a stage of development where they have become reliable, even on problematic samples. Their limitations and possible pitfalls are quite well known, which is not yet the case for NGS. NGS will develop to become analytically manageable, more reliable, and cheaper; at the same time, microarrays will become denser and much cheaper. Just as microarrays did not supplant qPCR, NGS will not supersede microarrays. The two techniques will complement each other.
It depends, as the previous comments clearly show. If you want to compare samples, it is easier to look at microarrays. I mean that literally: with a green/red two-color experiment you will see the changes directly.
NGS is often better in terms of absolute quantification, but the bias is still questionable. We observed that some sequences simply don't come up in the NGS data: they get lost, or show reduced yield during amplification in the preparation of the sequencing library. Maybe there are ways to prevent the bias completely, but we always observed some bias.
As for the costs, it also depends on throughput and the number of measurements. A small number of experiments is cheaper with NGS, especially if you subcontract it; medium amounts also by NGS; but for very, very large numbers I would have many microarrays made and go for classical tests.
But if unknown sequences are expected, the only answer is NGS, because that is the only method that can detect them.
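To make the throughput arithmetic above concrete, here is a toy break-even sketch in Python. Every price in it is an invented placeholder, not a quote from any vendor or core facility; the point is only the shape of the fixed-cost vs. per-sample trade-off:

```python
# Toy break-even sketch: at what sample count do arrays become cheaper
# than RNA-seq? All prices are placeholder assumptions, not real quotes.

ARRAY_FIXED = 5000.0       # assumed one-off overhead (scanner time, setup)
ARRAY_PER_SAMPLE = 300.0   # assumed chip + labelling cost per sample
SEQ_FIXED = 0.0            # assumed zero overhead when subcontracting
SEQ_PER_SAMPLE = 900.0     # assumed library prep + lane share per sample

def array_cost(n_samples: int) -> float:
    return ARRAY_FIXED + n_samples * ARRAY_PER_SAMPLE

def seq_cost(n_samples: int) -> float:
    return SEQ_FIXED + n_samples * SEQ_PER_SAMPLE

# Smallest sample count at which arrays undercut sequencing.
n = 1
while array_cost(n) >= seq_cost(n):
    n += 1
print(f"With these assumed prices, arrays win from {n} samples on.")
```

With these invented numbers the crossover is at nine samples; plug in your own local prices and the answer can flip entirely, which is exactly the commenter's point.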
Next Generation Sequencing (RNA-Seq) is cost-comparable to microarrays and is linear over a wider range. While more material is required for sequencing, you can overcome low amounts by amplification. Sequencing has the added benefit of detecting splice variants and ncRNAs, while, as Nallasivam notes below, arrays are limited to whichever probes are present on the array. Within the linear range, given enough material, microarray and next generation sequencing are comparable.
Moreover, I really don't think amplification-based enrichment of rare transcripts is a way out. It would introduce bias into the expression profile (differential efficiency of PCR reactions). And if you can amplify to higher quantities, why not RT/microarray? In my opinion, I still prefer a RACE assay.
As someone who has used both Microarray and RNAseq techniques, I would unequivocally say RNAseq (NGS) is by far the better of the two:
Greater sensitivity, thanks to a wider dynamic range and no cross-hybridization
More information, because you do not need any prior knowledge about the genome transcribing the mRNAs (i.e., which ORFs are real). It has also been shown countless times to have greater reproducibility, so it is more robust. It is also a lot easier to compare across experiments.
We are talking a lot about NGS sensitivity, but the basic annotation itself is quite fuzzy in NGS (as far as my understanding goes). Between-flow-cell read variability is also an issue to keep in mind (although spike-in controls may take care of it). In some cases you may lose less frequent transcripts in this situation. Splice variant calling and TSS calling are also quite fuzzy to me (if anyone has anything to guide me, please do). As a matter of fact, from sample prep to library prep to template prep, every step is quite sensitive, so a small handling error may alter the outcomes too. If you have suggestions for me, please come forward.
1. It is unbiased with respect to the loci; that is, it does not depend on the expected ORFs, and it can detect unexpected ORFs, ncRNAs and, generally speaking, anything that is there.
2. It is unbiased with respect to GC content, so AT-rich regions are not overlooked.
3. The broader range of detection sensitivity gives much deeper insight, and thus (although qPCR validation is usually still a must) it may provide you with information that you would otherwise have to get from qPCR.
4. You sequence what is there, from the first nucleotide to the last, so RACE may not be necessary.
5. You will see splice variants, somatic mutants and SNPs, so the cost, time and effort of those kinds of investigations are also saved.
So, if one has a limited number of known genes in mind (a pathway study or similar) for which the microarray is already in the drawer and the reagents are in the freezer, then go ahead with the chip. Otherwise, NGS seems to give more information with higher quality, and if the whole investigation is considered, it needs less money, time and effort.
I guess, with low-abundance RNAs, the difference between the two methods is that with NGS you are somewhat likely to miss them, whereas with the microarray you will surely miss them.
Otherwise I fully agree. Although these methods are both still developing and getting better and better, they are already extremely powerful and getting reasonably robust. Sample collection, handling and purification should be done with extra care, or else some biased junk will be sequenced at extra high quality.
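A quick back-of-the-envelope sketch of the "somewhat likely to miss them" point, assuming reads are drawn at random from the library (a Poisson approximation; the rare-transcript fraction and the depths are illustrative, not measured values):

```python
import math

# Chance of completely missing a rare transcript under random sampling:
# if the transcript is a fraction f of the library and we sequence N
# reads, then P(zero reads) ~= exp(-N * f) (Poisson approximation).
# The fraction and depths below are illustrative, not measured values.

def p_missed(fraction: float, total_reads: float) -> float:
    return math.exp(-total_reads * fraction)

f = 1e-7  # an assumed very rare transcript: ~1 in 10 million molecules
for depth in (1e6, 1e7, 1e8):
    print(f"depth {depth:.0e}: P(missed) = {p_missed(f, depth):.4f}")
```

So with NGS the miss probability shrinks with read depth, whereas an array with no probe for that species misses it at any depth.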
Microarray gene expression studies are useful for generating preliminary data. Of late, the practical feedback from grant review panels is that if funding is going to be spent generating expression data, why not get the in situ picture of transcriptomics instead of merely comparisons using measurements from sets of probes that we by default must agree to call a 'transcript'? My personal opinion is that high-quality results can be acquired using either type of platform.

With microarrays, we use internal reproducibility of the differential expression inference as a method to choose among the many types of transformation/normalization (i.e., our Efficiency Analysis informs us which methods to choose). With NGS, one must also be sure to correct for sequence length when using coverage as a proxy measure for expression. Although it may appear simple on the surface, unless one uses internal standards or employs a reference study design, comparison among different transcripts is especially challenging; 'coverage' is actually a complex function of the effects of local sequence characteristics on amplification and sequencing efficiency. Just as genome sequences show variation in coverage due to local sequence effects, the transcriptome will as well. Overcoverage is also a risk with NGS; one can mistakenly reduce differences by maxing out the coverage for some parts of the transcriptome.

In our methodological research, we are studying the problem of how much of the available coverage one might use to maximize the reproducibility of the inference of differential expression. If you're a methodologist, there are many areas for formal methodological development and research into optimizing the data representation of NGS-based transcriptomics, with many open problems worth looking into. I am attaching a .pps of a presentation I gave to the University of Pittsburgh NGS Bioinformatics community earlier this year on RNA-Seq data analysis. It raises a few important questions, and answers fewer.
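As a minimal illustration of the sequence-length correction mentioned above, here is an RPKM-style scaling sketch in Python (reads per kilobase per million mapped reads); the counts, lengths and library size are made up for illustration:

```python
# Minimal sketch of length correction: raw counts are not comparable
# across transcripts because longer transcripts collect more reads at
# the same expression level. RPKM-style scaling is one standard fix.
# The counts, lengths, and library size are made-up illustrative numbers.

def rpkm(read_count: float, transcript_len_bp: float, total_mapped: float) -> float:
    kilobases = transcript_len_bp / 1000.0
    millions = total_mapped / 1e6
    return read_count / (kilobases * millions)

TOTAL = 20_000_000  # assumed total mapped reads in the library

# Same underlying expression level, very different raw counts:
print(rpkm(400, 1000, TOTAL))   # short transcript -> 20.0
print(rpkm(2000, 5000, TOTAL))  # long transcript  -> 20.0
```

Note this only handles the length effect; the local-sequence effects on amplification and sequencing efficiency that the comment raises are not fixed by any simple per-gene scaling.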
I don't see it being a clear cut choice, not from a pragmatic viewpoint. If you want to get comparable sensitivity and coverage with RNAseq, you are going to need a LOT of sequence data (as in a high read depth per sample). If you are doing population level characterization or risk assessment, you may also be looking at a lot of samples and replicates. Add in the human cost of library processing, not to mention the extra time that NGS takes relative to array prep, and arrays still tend to come out as preferable when putting together a budget and timeline for a study.
So, I would say it depends a great deal on the specifics of the study - how many samples and replicates being a big part of it (and for many types of analyses, there is absolutely nothing about NGS data that allows you to skimp on the biological replicates needed).
If your target is gene expression profiling, both microarray and NGS are equally useful. But obviously the restriction of a closed study remains with microarray: not necessarily all genes will be covered (it depends on the probes). In my opinion, it is a better option to go for SAGE rather than microarray if you are also looking for a cost-effective approach.
There is more and more evidence that these technologies complement each other, as both have strengths and weaknesses (e.g. http://www.biomedcentral.com/1471-2164/13/629/).
Short answer: NGS (RNA-seq) gives you more transcript coverage for comparable cost, but the analysis pipeline is also less established and harder for some to turn into meaningful results.
If your primary goal is to test a hypothesis about biological response (e.g., do hypoxic cells produce more free radicals?), microarrays will help you answer it with less work. If your primary goal is to discover new and interesting things about your system (non-coding RNA, alternative splicing, etc), then NGS is the way to go.
For organisms for which you don't have a genome sequence, the best approach would be to use sequencing to obtain the genome, then RNA-Seq to derive the gene models (or to confirm the existence of in silico predicted ones). However, for 'final' gene expression profiling I would design a custom microarray or go for targeted sequencing.
Microarrays are a waste of money, with the possible exception of Roumen's scenario where you need to compare a large number of conditions. Microarrays are technologically inferior to RNA-Seq in every sense. Consider that their dependence on fluorescence signal is intrinsically indirect and far less quantitative, that they exhibit poor coverage of novel genes or lincRNAs, and that they are minimally informative about splice variants. I have heard more and more senior PIs publicly disavow microarrays during seminars, and a quick survey of the literature speaks for itself. Compare the number of high-profile papers that depend on RNA-Seq vs. microarray. I don't think the choice could be clearer.
Daniel, I don't agree with some of the arguments leading to your, IMHO, too harsh conclusion. (1) Although RNA-seq is based on counting sequences, these sequences have to be generated, and it is not clear how this process maps onto the amounts of RNA. (Non-linear) relationships between signal and RNA amount in microarrays can easily be mapped by spike-in controls generating calibration curves. (2) Exon arrays do allow for quantifying differential splicing. (3) A survey of recent "high profile" papers does NOT demonstrate any superiority of a method, but rather its novelty, an academic "sexiness". Neither 2D protein gels nor mass spec have effectively substituted for Western blots; DNA microarrays have not substituted for real-time PCR, and RNA-seq will not substitute for DNA microarrays.
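For what it's worth, the spike-in calibration idea in point (1) can be sketched in a few lines of Python: fit the signal-vs-amount relationship on a log-log scale from the spiked-in controls, then invert it for the genes of interest. All numbers here are invented for illustration; a real calibration would use many more spikes and a more careful model:

```python
import numpy as np

# Sketch of spike-in calibration: fit log(signal) = a*log(amount) + b
# from spiked-in controls of known amount, then invert the fit to map
# raw fluorescence back to estimated RNA amount. All numbers invented.

spike_amount = np.array([0.1, 1.0, 10.0, 100.0])         # known amounts
spike_signal = np.array([50.0, 420.0, 3100.0, 18000.0])  # measured signals

a, b = np.polyfit(np.log(spike_amount), np.log(spike_signal), 1)

def signal_to_amount(signal: float) -> float:
    """Invert the fitted log-log calibration curve."""
    return float(np.exp((np.log(signal) - b) / a))

print(signal_to_amount(1000.0))  # estimated amount for a raw signal of 1000
```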
Unfortunately, I do not have anything newer from high-impact journals than this:
http://www.pnas.org/content/108/9/3707.short
However, in a few weeks/months there should be a set of publications from the SEQC/MAQC-III consortium which should shed some light on this issue.
From our experience, we see that the problem with RNA-Seq is high noise for low expressors, which is a direct consequence of the random-sampling nature of the sequencing process (http://bioinformatics.oxfordjournals.org/content/27/13/i383.abstract).
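The low-expressor noise follows directly from counting statistics: if reads are sampled roughly at random, a gene with a mean of k reads has a relative standard deviation of about 1/sqrt(k). A small illustrative sketch (the counts are arbitrary):

```python
import math

# Poisson counting noise: a gene measured with a mean of k reads has a
# relative standard deviation (coefficient of variation) of 1/sqrt(k),
# so low expressors are intrinsically noisy. Counts are arbitrary.

for mean_reads in (4, 25, 100, 10_000):
    cv = 1.0 / math.sqrt(mean_reads)
    print(f"mean {mean_reads:>6} reads -> CV ~ {cv:.1%}")
```

A gene seen with 4 reads carries ~50% relative noise from sampling alone, while one seen with 10,000 reads carries ~1%.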
@Jochen- I appreciate your vigorous defense of microarrays. In the context of this debate, it is good to see strong advocates on both sides. However, having worked with both datasets, I'm not formulating my opinion just on what's hot right now in the literature. That being said, scientists vote with their feet, and I think the growing popularity of RNA-Seq is not simply a passing fad.
Some additional important points:
1) The goal of most expression studies is to report both the abundance and the identity of the RNA species present. Why accept an indirect substitute, as with microarrays? With the latter, we can never know for sure what our probes are hybridizing to, and correcting for non-linear relationships between mRNA abundance and fluorescence signal is non-trivial. Moreover, annotations for probe sets are inaccurate. So RNA identity is ambiguous, and we're forced to use a proxy measurement (fluorescence as a function of hybridization) for RNA abundance.
2) With RNA-Seq, "what you seek is what you get" (pun intended). It is a direct catalog of all RNA species present in your library. This makes comparisons across different platforms, cell types/tissues, and investigations much simpler. With microarray datasets, independent labs have no way to assess the complexity/quality of the input RNA libraries from someone else's data, and have to rely on all sorts of thorny assumptions and normalization procedures to compare physically distinct arrays. However statistically sound they may be, there is no way to argue that all the data transformations and normalization procedures performed on fluorescence hybridization signals do not lead to a loss of biological signal/insight.
3) Microarrays are intrinsically limited in which transcripts and transcript variants they can assay. There are no such a priori limitations for RNA-Seq approaches: if your sequences map to the genome, you can find them, whether or not they have been previously predicted/reported.
4) RNA-Seq can be adapted to a whole suite of emerging technologies, such as GRO-Seq, that enable investigators to capture snapshots of active gene transcription. Microarrays, in contrast, are relegated to studying RNA abundance as an average metric that reflects the history of gene expression, and therefore have much more limited temporal resolution. While average gene expression is not without importance, the fact is that you can compare RNA-Seq with GRO-Seq or other variations on NGS RNA methods. Microarrays lack such versatility, and I believe this will cripple their utility in future research.
Can arrays, specifically custom or focused arrays, have some role in future research? Sure, when well validated by a lab, they can be a useful screening tool when you know in advance what genes encapsulate the response of interest. However, if you are coming from the vantage point of performing de novo investigations, then no, I don't think you can put the two platforms on equal footing. Not by a long shot.
It is a pleasure to announce that during the Highlight Track of the ISMB 2014 conference we will give a talk presenting the key findings of the SEQC/MAQC-III Consortium (http://www.fda.gov/ScienceResearch/BioinformaticsTools/MicroarrayQualityControlProject/#MAQC-IIIalsoknownasSEQC).
The main manuscript of the SEQC Consortium:
"A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequence Quality Control consortium"
is already at the copy-editing stage at Nature Biotechnology and should be available shortly.
We therefore invite you to attend the talk (HT-PP27, more details below) and the following discussion, as well as to visit our posters advertising selected key results of the study: F45, F46, F47, F48, and N56.
PP27 (HT)
Power and Limitations of RNA-Seq: findings from the SEQC (MAQC-III) consortium
We present an extensive multi-centre, multi-platform study of the US-FDA MAQC/SEQC consortium, introducing a landmark RNA-Seq reference dataset comprising 30 billion reads. Several next-generation sequencing, microarray, and qPCR platforms were examined. The study design features known mixtures, wide-dynamic-range ERCC spikes, and a nested replication structure -- together allowing a large variety of complementary benchmarks and metrics. We find that none of the examined technologies can provide a 'gold standard', making the built-in truths of this reference set a critical device for the development and validation of novel or improved algorithms and data processing pipelines. In contrast to absolute expression levels, for relative expression measures good inter-site reproducibility and agreement across platforms could be achieved with additional filtering steps. Comparisons with microarrays identified complementary strengths, with RNA-Seq at sufficient read depth detecting differential expression more sensitively, and microarrays achieving higher rank reproducibility. At the gene level, comparable performance was reached at widely varying read depths, depending on the application scenario. On the other hand, RNA-Seq has heralded a gold rush for the study of alternative gene transcripts. Even at read depths beyond 100 million, we find thousands of novel junctions, with good agreement between platforms. Remarkably, junctions supported by only ~10 reads achieved qPCR validation rates >80-100%, illustrating the unique discovery power of RNA-Seq. Finally, the modelling approaches for inferring alternative transcript expression levels from read counts along a gene can similarly be applied to probes along a gene in high-density next-generation microarrays. We show that this has advantages in quantitative transcript-resolved expression profiling. There is still much to do!
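For readers wondering what such "additional filtering steps" might look like in practice, here is a deliberately naive Python sketch under the assumption that filtering means dropping genes whose counts are too low in either condition before computing log-ratios; the consortium's actual filters and thresholds may well differ:

```python
import math

# Naive sketch of a count filter before computing relative expression:
# drop genes whose counts are too low in either condition, since their
# log-ratios are dominated by sampling noise. The threshold below is an
# assumption for illustration only; real pipelines choose it carefully.

MIN_COUNT = 16

def filtered_log_ratios(counts_a: dict, counts_b: dict,
                        min_count: int = MIN_COUNT) -> dict:
    """Per-gene log2(A/B), skipping genes below the count threshold."""
    ratios = {}
    for gene, a in counts_a.items():
        b = counts_b.get(gene, 0)
        if a >= min_count and b >= min_count:
            ratios[gene] = math.log2(a / b)
    return ratios

print(filtered_log_ratios({"g1": 200, "g2": 5}, {"g1": 50, "g2": 3}))
# -> {'g1': 2.0}; g2 is dropped as too noisy to call
```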
Pawel, are there other manuscripts coming out of the MAQC-III/SEQC project as well? If so, will they be in different journals or all together in one volume as before?
All the others should also be published in Nature Biotechnology. They will probably appear online at different times, but they might go out together in one print issue.