I would like to know if the mean of my delta delta Ct for my control group (non treated) normalized to the endogenous control must always be equal to 1.
So you are talking about delta delta Ct. To answer your question, it is worthwhile to make clear what this means. The delta delta Ct is the difference between two delta Cts, and these, in turn, are calculated as the difference between Cts. Let us abbreviate "delta" by "d", so we have
ddCt = dCt[A]-dCt[B]
where A and B denote two different conditions, like treated and control, or diseased and healthy. Further we have
dCt[A] = dCt[A,ref]-dCt[A,goi]
dCt[B] = dCt[B,ref]-dCt[B,goi]
where goi and ref denote that the Ct values are from the gene of interest and the reference gene, respectively. Both genes are measured in either condition A and B.(*)
Now the most important and most difficult part to understand is the meaning of a Ct value itself. A Ct value is a value proportional to minus the logarithm of the initial amount/concentration of the amplicon sequence.
The ddCt method is valid only if we have reasons to assume that the amplification efficiencies of goi and ref are equal. If you have reasons to suspect that these are not equal, the simplifications from the original exponential equations leading to dCt and then to ddCt are not possible, and the math is a little more complicated. This is what Jack refers to as the "Pfaffl method". But since you were talking about the ddCt method, let us assume the amplification efficiencies are identical.
Therefore it follows that a dCt is proportional to the log ratio of concentrations for the two genes in one condition (recall that the log of a ratio equals the difference of the logs: log(x/y) = log(x)-log(y)). When the dCt values are calculated as shown above, the negative signs of the proportionalities cancel out, so that a higher dCt indicates a higher relative (or normalized) expression: the expression of the goi normalized to the expression of the ref gene. The ref gene is used as a "loading control". The dCt value is thus a normalized log expression of the goi. However, it still contains unknown proportionality factors. So we must more correctly state that the dCt is proportional to the log normalized expression (of the goi).
Such a dCt is obtained for both conditions, A and B. Whatever this unknown proportionality factor is, it is the same for both conditions (because the same genes are measured and the dCts are derived in the same way!), and thus it will cancel out when the two dCt values are subtracted to get the ddCt value.
The ddCt is thus a log-ratio of the normalized expressions of the goi under the two conditions. Because all unknown proportionality factors are cancelled out, it really directly is a log-ratio of expressions. The base of the logarithm equals the amplification efficiency. This is actually unknown (unless explicitly determined) but often assumed to be 2 (=100%, doubling of all amplicons in each cycle). Therefore, 2^ddCt will give you the fold-difference in expression for condition A versus condition B.
A ddCt of 0 indicates no change at all (same normalized expression of the goi in both conditions): 2^0 = 1 (the expression in A is as high as in B).
Positive ddCt values indicate a higher expression under A as compared to B, e.g. for a ddCt of +1 the fold-difference is 2^(+1) = 2 (the expression in A is twice as high as in B).
Negative ddCt values indicate a lower expression under A as compared to B, e.g. for ddCt = -1 the fold-difference is 2^(-1) = 1/2 = 0.5 (the expression in A is half the expression in B).
(*) IMPORTANT: In contrast to Livak's original paper on the ddCt method, I exchanged the order of the terms to get the dCt values. Livak was actually calculating the concentration of the ref gene normalized to the concentration of the goi, which is rather counter-intuitive. This "mistake" was corrected in the final calculation from the ddCt, which he gives as 2^(-ddCt) [mind the "minus" sign!].
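To make the sign convention concrete, here is a minimal Python sketch of the calculation described above. The function names and the Ct values are made up for illustration, and an efficiency of 2 for both genes is assumed. It also shows that Livak's ordering with 2^(-ddCt) ends up at the same fold-difference.

```python
def delta_ct(ct_ref: float, ct_goi: float) -> float:
    """dCt with the convention used in this post: reference minus gene of interest."""
    return ct_ref - ct_goi


def fold_difference(ct_a_ref, ct_a_goi, ct_b_ref, ct_b_goi, efficiency=2.0):
    """Return (ddCt, fold-difference) of condition A versus B.

    Assumes both genes amplify with the same efficiency (default 2.0).
    """
    ddct = delta_ct(ct_a_ref, ct_a_goi) - delta_ct(ct_b_ref, ct_b_goi)
    return ddct, efficiency ** ddct


# Hypothetical Ct values, for illustration only.
ct_a_ref, ct_a_goi = 18.0, 24.0   # condition A (e.g. treated)
ct_b_ref, ct_b_goi = 18.0, 25.0   # condition B (e.g. control)

ddct, fold = fold_difference(ct_a_ref, ct_a_goi, ct_b_ref, ct_b_goi)

# Livak's ordering (goi minus ref) flips the sign, hence his 2^(-ddCt).
livak_ddct = (ct_a_goi - ct_a_ref) - (ct_b_goi - ct_b_ref)
livak_fold = 2.0 ** (-livak_ddct)

print(ddct, fold, livak_fold)
```

Here the goi crosses the threshold one cycle earlier in A while the reference is unchanged, so the ddCt is +1 and the expression in A is twice that in B under either convention.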
Ok, now that we elaborated this a little, we can come back to your question:
"I would like to know if the mean of my delta delta Ct for my control group (non treated) normalized to the endogenous control must be Always equal to 1."
As you may recognize, the ddCt already gives a log ratio for the normalized treated to the normalized controls. (The normalization is done to the reference gene). So the question does not make much sense.
If I assume that you are asking for the dCt values instead (i.e. the normalized log expressions for either condition), the answer is: NO. There is NO reason that dCt values need to be 1. They can be any positive or negative number, depending on the relative expression of the two genes (goi and ref), which strongly depends on your arbitrary choice of a ref gene. Note that a value of 0 also has no meaning. It does not even mean that both genes (goi and ref) are expressed at a similar level, since the dCt still contains some unknown proportionality factor. Similarly, a dCt of 1 also does not mean anything particular (e.g., that one gene is expressed twice as strongly as the other). A single dCt is simply not interpretable. You can only interpret *differences* between dCt values (from different conditions); this way you are actually looking at ddCt values that do have a meaning, namely the log fold-difference of the normalized expressions.
After all samples are normalized to their respective reference gene ("housekeeping gene"; "endogenous control gene"), all values are divided by the reference-gene-normalized control (sometimes called the calibrator). The control is thereby divided by itself (as is the course of the usual calculations) and so, by definition, always becomes "1".
See the Pfaffl (New Mathematical Model for...) 2001 paper - should explain things for you. You can use whatever scale you want of course - but I think this is the answer you were seeking?
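A small Python sketch of what "the control becomes 1" means when there are several control samples. The dCt values are invented, and an efficiency of 2 is assumed. Each control sample is expressed relative to the mean control dCt: the geometric mean of those relative values is 1 by construction, while the individual samples scatter around it and their arithmetic mean comes out slightly above 1.

```python
from statistics import mean

# Hypothetical dCt values (ref minus goi) of three untreated control samples.
control_dct = [-6.8, -7.0, -7.2]

calibrator = mean(control_dct)  # mean dCt of the control group

# Each control sample relative to the calibrator, on the fold scale.
relative = [2.0 ** (dct - calibrator) for dct in control_dct]

# Geometric mean = 2^(mean of the log2 values) = 2^0 = 1.
geometric_mean = 2.0 ** mean(dct - calibrator for dct in control_dct)
arithmetic_mean = mean(relative)

print(relative)
print(geometric_mean, arithmetic_mean)
```

So whether the control group "is 1" depends on the scale on which the averaging is done; on the log scale (ddCt) the control mean is exactly 0.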
The whole dCt method works when the efficiencies of both primers are similar (ideally should both be close to the ideal of 2.0 or 100%). If this is the case, it is absolutely ok to compare dCt values between groups with a t-test, as the assumption of a normal distribution of dCt values is not unreasonable. The point estimate of this comparison is the ddCt value.
Pfaffl proposed a method that should give more correct results when the amplification efficiencies are not identical (and, thus, suboptimal for at least one primer system). In this case, the dCt value would depend on the (usually unknown) sample concentration, and therefore the dCt values are essentially meaningless for quantification. If one knows the amplification efficiencies very precisely, then one can calculate estimates of the initial target sequence concentrations directly from the Ct values, and one can then calculate the ratios of such initial concentrations to normalize the value for the goi to the reference gene. This is what the Pfaffl method does. It is not about dividing dCt values; it is not about dCt values at all.
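As a rough Python sketch of that idea: the ratio is built from each gene's own efficiency and its own Ct difference between control and treated. The function name, argument names and numbers are mine, chosen for illustration. With both efficiencies equal to 2 it reduces to 2^ddCt.

```python
def pfaffl_ratio(e_goi, e_ref,
                 ct_goi_control, ct_goi_treated,
                 ct_ref_control, ct_ref_treated):
    """Expression ratio treated/control with gene-specific efficiencies.

    e_goi and e_ref are amplification factors per cycle (2.0 = 100%).
    """
    goi_term = e_goi ** (ct_goi_control - ct_goi_treated)
    ref_term = e_ref ** (ct_ref_control - ct_ref_treated)
    return goi_term / ref_term


# Hypothetical Ct values: goi one cycle earlier in treated, reference unchanged.
ideal = pfaffl_ratio(2.0, 2.0, 25.0, 24.0, 18.0, 18.0)
suboptimal = pfaffl_ratio(1.9, 2.0, 25.0, 24.0, 18.0, 18.0)

print(ideal, suboptimal)
```

Note that the result is only as good as the efficiency values plugged in, which is exactly the concern raised in this post.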
My personal concern with this is that I would not trust a PCR that has a less than ideal efficiency. I would doubt that the efficiency is constant throughout the early cycles of the PCR, and that the amplification process is stable. If I observed a stable system but with suboptimal efficiency, I would first think of a systematic problem with the assay that systematically impacts the Ct values (bad background correction, unspecific amplification, problems with signal generation / quenching etc.). I would not trust the assay to be used for quantification. I would start testing different primers and find an assay that is robust and reliable and in agreement with the assumptions of an "ideal PCR". Having such an assay, I would not need to think about the Pfaffl method and would simply use dCt values to learn about the gene regulation.
If you first convert all Cq values for all targets (and ref genes) in your assay to their 100% efficient incarnations, this takes efficiency into account up front before proceeding with your dCq, ddCq or other analyses.
For instance, if you observe a Cq of 23, but know that the efficiency of the reaction is 93% - thus has an EAMP value of 1.93, you would convert the observed Cq of 23 to its 100% efficient incarnation as follows:
23*log2(1.93) = 21.81782
Do this for all your Cq values before processing, and you will have taken efficiencies into account for all...
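The conversion above is easy to script. A minimal Python sketch (the function name is just for illustration) that reproduces the example and checks that the converted Cq at base 2 describes the same starting amount as the observed Cq at the measured EAMP:

```python
import math


def cq_at_full_efficiency(cq_observed: float, eamp: float) -> float:
    """Rescale an observed Cq to the Cq expected at 100% efficiency (EAMP = 2)."""
    return cq_observed * math.log2(eamp)


converted = cq_at_full_efficiency(23.0, 1.93)
print(round(converted, 4))

# Same relative starting amount either way: EAMP^(-Cq) == 2^(-converted Cq).
same_amount = math.isclose(1.93 ** -23.0, 2.0 ** -converted, rel_tol=1e-9)
print(same_amount)
```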
1) do NOT divide Ct values. Please make yourself familiar with the meaning of a Ct value: it is related to -log(Conc). A difference of Ct values is thus related to a log-ratio of concentrations. A ratio of Ct values would mean some strange root of a concentration.
2) for each sample you get a dCt value. You do the stats (t-test or whatever) on these values.
As Jochen has also mentioned (in addition to his immediate post above), if your reaction efficiencies are highly questionable for any of your target or reference genes, your best option is to redesign primers/primers-probes for the problematic targets or reference genes, and re-assess.
Ideally, to assess efficiencies by a relative dilution standard curve approach, one would prepare at least 8 technical replicates at each dilution point until nil reaction behavior (Poisson noise) begins to rear its head. At that point, a minimum of 16 technical replicates would be required; although 48 technical replicates would be better suited in determining single copy reactions.
In reality, many investigators never pursue these latter courses of action as concerns over cost and time needed to optimize conditions for all targets and reference genes and identify the Poisson limits for each seem to over-ride the situation.
Efficiency of amplification as demonstrated by a standard curve generated by a serial dilution of a positive sample and when plotting Cq vs log of dilution factor:
Standard curve efficiency (E) = 10^(-1/slope) - 1
and exponential amplification (EAMP) = 10^(-1/slope)
If you included serially-diluted standards for each target on your plates, your machine should calculate this for each target.
If the machine says your efficiency is 97%, then you know that EAMP = 1.97 (according to the equations above).
EAMP is the value used in quantification equations for qPCR.
And, for any standard curve - whether based on absolute copies or relative serial dilutions: Initial reaction copies or initial relative amount of target = EAMP^(b-Cq)
When b = the estimated (forecast) Cq value for 1 copy, then this equation solves for initial reaction copies. But when b = the y-intercept for a serial dilution series, that y-intercept is only a relative value somewhere in space - generated by the most concentrated sample on the standard curve.
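For anyone who wants to check what the machine reports, here is a small Python sketch of these equations. It assumes the x-axis is the log10 of the relative input amount (so the slope is negative), and the Cq values are invented to mimic a perfectly efficient 10-fold dilution series; the helper names are mine.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical 10-fold dilution series: relative amounts 1, 0.1, 0.01, 0.001.
log10_amount = [0.0, -1.0, -2.0, -3.0]
cq = [20.0 + math.log2(10) * k for k in range(4)]  # ideal doubling per cycle

slope, b = fit_line(log10_amount, cq)
eamp = 10 ** (-1 / slope)   # exponential amplification
efficiency = eamp - 1       # 1.0 corresponds to 100%


def relative_amount(cq_value: float) -> float:
    """Initial relative amount of target = EAMP^(b - Cq)."""
    return eamp ** (b - cq_value)


print(slope, eamp, efficiency)
print(relative_amount(cq[1]))
```

Because b here is the intercept of a relative dilution series, the result is an amount relative to the most concentrated standard, not a copy number.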
Is the "non-normality" relevant? Could you provide a normal Q-Q plot of the dCt values? What is the sample size?
In principle, CIs can be determined by bootstrapping. The only assumption made is then that the distribution of the sample is representative and detailed enough (that is, a rather large sample size is required*). I don't know how the CIs you got were determined, so I can't say if they are meaningful.
* For large sample sizes, the standard statistics work well, because the central limit theorem ensures that the distribution of the statistic (i.e. the mean ddCt) will well approximate the normal distribution, even when the data distribution (dCt) is not a good approximation.
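To illustrate what such a bootstrap could look like, here is a minimal percentile-bootstrap sketch in Python using only the standard library. The dCt values, the number of resamples and the seed are arbitrary choices for illustration, and with groups this small the caveat about sample size applies in full.

```python
import random
from statistics import mean


def bootstrap_ddct_ci(dct_a, dct_b, n_resamples=5000, level=0.95, seed=1):
    """Percentile bootstrap interval for mean(dct_a) - mean(dct_b)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        resample_a = rng.choices(dct_a, k=len(dct_a))
        resample_b = rng.choices(dct_b, k=len(dct_b))
        estimates.append(mean(resample_a) - mean(resample_b))
    estimates.sort()
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Hypothetical dCt values (ref minus goi), five samples per group.
treated = [-6.2, -5.9, -6.1, -5.8, -6.0]
control = [-7.1, -6.9, -7.3, -7.0, -6.8]

ci_lower, ci_upper = bootstrap_ddct_ci(treated, control)
print(ci_lower, ci_upper)
```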
You write n=5 per group and the plots show 10 values each. The empirical quantiles seem to be raw Ct values. So I assume what we see in these plots are actually two distinct groups of Ct values. That's expected if there is a group (treatment) effect, in which case the overall distribution cannot be normal (it should be bimodal - that's what you see on the plots). And in fact it is not the overall distribution, but rather the conditional distribution (conditional on the group/treatment), that should be normal. Here you have only n=5 points, which really is too few to judge a distribution (to judge if some assumed distributional model is reasonable). The usual way to handle this is to check the residuals of the fitted model, rather than the raw Ct values. Including "assay" besides "group/treatment" as an additional factor in the model allows you to get the residuals from all 4 assays together, so that you have 4x10 = 40 points (residuals) to plot in one QQ-plot, which will give you a far better impression of the distribution (allows you a better judgement if the assumption of a conditional normal distribution is reasonable).
If you use R, you may have the Ct values (for all assays and treatments: 40 values) in a numeric vector ("ct"), and the assay and group memberships in factors ("assay" and "group"). The model fit would then be achieved by
The QQ-plot looks good. There is nothing that should make you believe that the normal distribution assumption was unreasonable. I would go for a standard analysis (you already have calculated the model, so you are actually done already).
One thing (I forgot last time): if the 4 assays are for 4 different target genes, then I would change the model to dct ~ assay + assay:group, as the group effect should be allowed to be different for each of the target genes. The model will have a coefficient for each assay:group interaction, which are the ddct values giving the log fold-change of the treatment for the respective gene (assay). If "assay" just means an independent replication (same genes), then the model ct ~ assay + group is fine, and there will be a single coefficient for "group" which is the ddct (for the gene).
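For readers not using R, the following is a plain Python sketch of what the model dct ~ assay + assay:group estimates. With two categorical factors this model has one parameter per assay-by-group cell, so its fitted values are the cell means, the per-assay ddct is the treated-minus-control difference of cell means, and the residuals are the deviations from the cell means. The data, the group labels and the function name are invented for illustration; standard errors and confidence intervals are not computed here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dCt values as (assay, group, dct) rows. Illustration only.
data = [
    ("geneX", "control", -7.1), ("geneX", "control", -7.0), ("geneX", "control", -6.9),
    ("geneX", "treated", -6.1), ("geneX", "treated", -6.0), ("geneX", "treated", -5.9),
    ("geneY", "control", -3.2), ("geneY", "control", -3.0), ("geneY", "control", -2.8),
    ("geneY", "treated", -3.6), ("geneY", "treated", -3.5), ("geneY", "treated", -3.4),
]


def fit_cell_means(rows):
    """Return per-assay ddct (treated - control) and the pooled residuals."""
    cells = defaultdict(list)
    for assay, group, dct in rows:
        cells[(assay, group)].append(dct)
    cell_mean = {key: mean(values) for key, values in cells.items()}

    assays = sorted({assay for assay, _, _ in rows})
    ddct = {a: cell_mean[(a, "treated")] - cell_mean[(a, "control")] for a in assays}

    # Residuals from all assays together, e.g. for a single QQ-plot.
    residuals = [dct - cell_mean[(assay, group)] for assay, group, dct in rows]
    return ddct, residuals


ddct_by_assay, residuals = fit_cell_means(data)
print(ddct_by_assay)
print(len(residuals))
```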
a) there is no need to "propagate" the standard errors. These are given together with the estimates (i.e. ddct) from the model fit. You can also use the function confint(model) to get the confidence intervals of the estimates (somewhat preferable to the standard errors, but sadly less commonly used).
b) I am not a big fan of efficiency corrections. In my opinion, if you are not able to achieve an efficiency of virtually 2.0, then I doubt that the reaction is robust, stable, and runs with a constant efficiency throughout. And there is a considerable risk of severely biasing the results when the efficiency value used for correction is wrong itself. However, given 2.0 is (for whatever reason) a reasonable estimate of the efficiency, then the ddct values are log2 fold-changes. I would always plot log fold-changes (i.e. ddct values) and never fold-changes. You cannot transform the standard error from the ddct scale to the fold-change scale; you can only transform the limits of the confidence interval, as you wrote in your last sentence (so that would be appropriate; I just don't like plotting fold-changes at all, because this distorts the visual impression of the biological relevance of these changes [that relevance is far better represented on a log scale]).
You usually have several control samples and several treated samples. The simplest way to get the mean ddCt is to average the sample dCt values within each group first and then subtract the mean dCt values. There are other ways that lead to the same result, because averaging and subtraction are linear operations.
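A quick Python check of that statement with made-up dCt values: subtracting the group means gives the same mean ddCt as averaging all treated-minus-control differences.

```python
from statistics import mean

# Hypothetical dCt values (ref minus goi); group sizes need not be equal.
control_dct = [-7.2, -7.0, -6.8]
treated_dct = [-6.1, -5.9]

# Way 1: average within each group, then subtract the means.
ddct_from_means = mean(treated_dct) - mean(control_dct)

# Way 2: form every treated-minus-control difference, then average.
ddct_from_pairs = mean(t - c for t in treated_dct for c in control_dct)

print(ddct_from_means, ddct_from_pairs)
```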
Sir, I meant to ask whether we should include two control (untreated) samples along with the treated ones.
My experimental design spans one control (0 hours treatment/untreated) sample and three treated samples of 6, 12 and 24 hours of treatment.
There is one biological replicate and there are three technical replicates.
Normally, according to your previous suggestions on qPCR, I understood that the control samples should be two whose ddCt is subtracted from the dCt of treated samples to get their ddCt.