I have two different tumor types, Tumor A and Tumor B, with 10 samples for each tumor type. I ran qPCR for a panel of 45 genes and I would like to plot the expression values in a heat map to visualize the data. My question is, how do you properly transform and normalize the data before plotting it in a heat map, dot plot, or for significance testing? My experimental design is as follows:
1) Use actin as the reference gene.
2) Compute delta CT for each gene, i.e. CT(gene1, sample1) - CT(actin reference, sample1). I do this for all samples.
This is where I am unsure what to do next. What I have currently done is this:
3) Compute the average delta CT for the group A tumor type.
4) Subtract the group A average delta CT from the delta CT value of each sample in both group A and group B, i.e. deltadeltaCT for gene1 in sample1 = deltaCT(gene1, sample1) - AvgdeltaCT(gene1, group A), and likewise for the samples in group B.
5) Compute 2^-ddCT for each sample.
6) Plot these "fold change relative to Tumor A" values in a heat map.
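To make steps 2 to 5 concrete, here is a minimal sketch of what I am computing for a single gene. The CT values are made up for illustration, the function names are my own, and I am assuming the sign convention ddCT = dCT(sample) - mean dCT(Tumor A).

```python
from statistics import mean

def delta_ct(ct_gene, ct_ref):
    """Step 2: dCT = CT(gene) - CT(reference gene), computed per sample."""
    return [g - r for g, r in zip(ct_gene, ct_ref)]

def fold_change_vs_group_a(dct_a, dct_b):
    """Steps 3-5 for one gene: 2^-ddCT relative to the Tumor A mean dCT."""
    calibrator = mean(dct_a)                          # step 3: average dCT of group A
    ddct = [d - calibrator for d in dct_a + dct_b]    # step 4: ddCT for every sample
    return [2 ** -x for x in ddct]                    # step 5: fold change

# Made-up CT values for one gene, 3 samples per group (illustration only)
ct_gene_a, ct_actin_a = [24.0, 25.0, 26.0], [18.0, 18.0, 18.0]
ct_gene_b, ct_actin_b = [22.0, 23.0, 29.0], [18.0, 18.0, 18.0]

dct_a = delta_ct(ct_gene_a, ct_actin_a)
dct_b = delta_ct(ct_gene_b, ct_actin_b)
fold_changes = fold_change_vs_group_a(dct_a, dct_b)  # group A samples first, then group B
print(fold_changes)
```

In the real data this would be repeated for each of the 45 genes, giving a genes x samples table of fold changes.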
When I follow this approach, some fold change values are many times higher than most, e.g. 25, when the average fold change for all samples across both tumor types is ~2. This skews the dataset when making a heatmap, causing all the samples to appear "cold" or "low" when in fact they are not. This makes me think that I need to normalize the data to a fixed scale (quantile normalize?) of, say, -3 to 3, setting all values above a certain cutoff equal to a max value, say 3, and all values below a certain cutoff equal to a min value, say -3. Is this a correct assumption?
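The capping idea could be sketched roughly as follows. This assumes I first take log2 of the fold changes so that up- and down-regulation are symmetric around 0; the function name and the -3/3 limits are just placeholders, not a recommendation.

```python
import math

def clip_log2_fold_changes(fold_changes, lo=-3.0, hi=3.0):
    """log2-transform fold changes, then cap them to the range [lo, hi]."""
    log2_fc = [math.log2(fc) for fc in fold_changes]
    return [min(max(v, lo), hi) for v in log2_fc]

# Made-up fold changes, including one extreme value like the 25 mentioned above
fold_changes = [25.0, 2.0, 1.0, 0.5, 0.01]
clipped = clip_log2_fold_changes(fold_changes)
print(clipped)
```

With this, the extreme sample no longer stretches the color scale, but its true magnitude is also hidden in the plot.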
What I want to accomplish is to show that the gene expression signatures of group A are different from those of group B. I'm not necessarily attached to representing it as fold change; I am just looking for the best way to represent the data to accomplish my goal.
Someone mentioned log transforming my 2^-ddCT values, then subtracting the mean across the entire dataset from each value, then quantile normalizing. Would this be an appropriate approach for the data I have currently generated?
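As I understand that suggestion, it would look something like the sketch below. The values are made up, the function names are mine, and I am assuming that "quantile normalize" means forcing every sample (column) to share the same distribution, with ties simply broken by row order.

```python
import math
from statistics import mean

def log2_center(fold_changes):
    """log2-transform a genes x samples table, then subtract the grand mean."""
    logged = [[math.log2(v) for v in row] for row in fold_changes]
    grand_mean = mean(v for row in logged for v in row)
    return [[v - grand_mean for v in row] for row in logged]

def quantile_normalize(table):
    """Replace each value by the mean, across samples, of the values at its rank."""
    columns = [list(col) for col in zip(*table)]          # one list per sample
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(vals) for vals in zip(*sorted_cols)]
    normalized_cols = []
    for col in columns:
        order = sorted(range(len(col)), key=lambda i: col[i])  # row indices by rank
        new_col = [0.0] * len(col)
        for rank, i in enumerate(order):
            new_col[i] = rank_means[rank]
        normalized_cols.append(new_col)
    return [list(row) for row in zip(*normalized_cols)]   # back to genes x samples

# Made-up 2^-ddCT values: 3 genes (rows) x 2 samples (columns)
fold_changes = [[1.0, 4.0],
                [2.0, 8.0],
                [4.0, 2.0]]
centered = log2_center(fold_changes)
normalized = quantile_normalize(centered)
print(normalized)
```

After this, every sample has exactly the same set of values and only the ranking of genes within each sample differs, which is part of why I am unsure it is appropriate for a 45-gene panel.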
Thanks for the help!