Can anyone tell me how to determine the deviation of NGS for gene expression profiling? In our case, the TruSeq RNA and DNA Sample Preparation Kits v2 and Illumina were of no use.
The biggest source of variability is RNA extraction. It needs to be done as systematically as possible, preferably at the same bench, by the same person, at the same time. If this is not possible (e.g., the experimental design is too big), do the RNA extraction in a randomized way (e.g., one sample from each treatment group in each daily extraction batch). Experimental design is important: plan the experiment big, keep all of the extracted RNA in the fridge, and select for library prep and sequencing only the subset of samples that is needed first.
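The randomized extraction scheme described above (one sample from each treatment group per daily batch) could be sketched roughly like this in Python. The group names, sample IDs and the function name `assign_extraction_batches` are made up for illustration; this is only a minimal sketch of a randomized block layout, not a protocol from any kit.

```python
import random

def assign_extraction_batches(samples_by_group, seed=0):
    """Return a list of daily batches, each holding one sample per treatment group."""
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    # Shuffle within each group so that which replicate lands on which day is random.
    shuffled = {g: rng.sample(s, len(s)) for g, s in samples_by_group.items()}
    n_batches = max(len(s) for s in shuffled.values())
    batches = []
    for day in range(n_batches):
        batch = [s[day] for s in shuffled.values() if day < len(s)]
        rng.shuffle(batch)  # also randomize the processing order within the day
        batches.append(batch)
    return batches

# Hypothetical design: three treatment groups, three biological replicates each.
samples = {
    "control": ["C1", "C2", "C3"],
    "treated_low": ["L1", "L2", "L3"],
    "treated_high": ["H1", "H2", "H3"],
}
batches = assign_extraction_batches(samples, seed=42)
for day, batch in enumerate(batches, start=1):
    print(f"day {day}: {batch}")
```

With a layout like this, any day-to-day extraction effect is spread evenly over the treatment groups instead of being confounded with one of them.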
Library prep and sequencing result in much smaller bias than RNA extraction.
If the kit does not work in your hands - change it, or learn hands-on in a lab where this particular library prep works :)
Btw - in Wien you have a world-class group doing RNAseq analysis - at BOKU (see their set of SEQC consortium papers from last month), they know many such details...
Actually, I'm asking whether anyone knows what the deviation of gene expression profiling by NGS is. That is, if you run a sample 5 times, what is the standard deviation?
If your results from all sampled times (biological replicates are of interest in most cases) passed QC, then almost all methods for analyzing NGS data will calculate the standard deviation or confidence intervals for each gene. You may want to try LOX, one of the programs for analyzing RNA-seq data, developed by Dr. Townsend at Yale.
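To make the "run a sample 5 times" question concrete, here is a rough Python sketch of a per-gene mean, standard deviation and coefficient of variation across replicate runs. The gene names and expression values are invented purely for illustration, and the function `replicate_spread` is hypothetical; dedicated RNA-seq tools model the variance in more sophisticated ways than this.

```python
import statistics

def replicate_spread(values_by_gene):
    """Per-gene mean, sample standard deviation and CV across replicate runs."""
    summary = {}
    for gene, values in values_by_gene.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
        cv = sd / mean if mean > 0 else float("nan")
        summary[gene] = {"mean": mean, "sd": sd, "cv": cv}
    return summary

# Made-up normalized expression values for one sample sequenced 5 times.
normalized = {
    "geneA": [100.0, 110.0, 90.0, 105.0, 95.0],  # highly expressed
    "geneB": [5.0, 9.0, 2.0, 7.0, 4.0],          # lowly expressed
}
summary = replicate_spread(normalized)
for gene, stats in summary.items():
    print(f"{gene}: mean={stats['mean']:.1f} sd={stats['sd']:.2f} cv={stats['cv']:.2f}")
```

The coefficient of variation is included because the raw standard deviation scales with expression level, so it is hard to quote one single number for "the" deviation of the method.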
It sounds a bit like you're heading towards wanting to use RNASeq NGS for quantitative gene expression analysis. Because of the inherent variability noted by Michal and Zheng, you're always going to have trouble comparing run to run. You can really only compare within a given run to other genes based on read depth, and you hope to compile enough biological and technical replicates to make statistical significance out of that. There was a lot of discussion starting a few years ago about how you normalize NGS data for statistical and relative analysis. There's still not a good answer for that. Any time you have PCR as part of your workflow, you lose your tether to true comparative quantitation.
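As a small illustration of comparing genes relative to read depth, one simple and commonly used scaling is counts per million (CPM). The sketch below uses invented read counts and a hypothetical helper `counts_per_million`; it only removes differences in sequencing depth and does not settle the normalization question or correct for run-to-run batch effects.

```python
def counts_per_million(counts):
    """Scale raw read counts of one library to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

# Invented counts: run_b is the same library sequenced twice as deep as run_a.
run_a = {"geneA": 500, "geneB": 1500, "geneC": 8000}
run_b = {"geneA": 1000, "geneB": 3000, "geneC": 16000}

cpm_a = counts_per_million(run_a)
cpm_b = counts_per_million(run_b)
for gene in run_a:
    print(f"{gene}: run_a={cpm_a[gene]:.0f} CPM, run_b={cpm_b[gene]:.0f} CPM")
```

In this idealized case the CPM values agree between runs; with real data, extraction, PCR and library prep effects remain after depth scaling.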
Just like with microarray analysis, you will need to perform multiple independent biological replicates according to your pre-planned experimental design (e.g., randomized block, split block, split-split block, ...). Biological variation will always be the greatest source of variability. The first thing is to get a high-quality dataset; then you can mine it over and over for the parameters you are interested in. Talk to a statistician who works with NGS datasets.