03 January 2022

I am following the way a previous paper (PMID: 30948552) treated its spatial transcriptomic (ST) data. It appears that they combined the expression matrices of the different conditions (the paper does not say whether these were normalized or log-transformed), calculated a gene-gene similarity matrix (using Pearson rather than Spearman correlation), and finally obtained gene modules (clustered by L1 norm and average linkage) that were differentially expressed between conditions.
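To make the clustering step concrete, here is a minimal sketch of average-linkage clustering with L1 (Manhattan) distance. It assumes the distance is taken between rows of the gene-gene similarity matrix, which the paper may or may not have done. The names `l1`, `average_linkage`, and `toy_similarity` are hypothetical, and the matrix values are made up for illustration.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_linkage(rows, n_clusters):
    """Naive agglomerative clustering.

    Repeatedly merges the two clusters whose mean pairwise L1 distance
    is smallest, until n_clusters remain. Returns lists of row indices.
    """
    clusters = [[i] for i in range(len(rows))]
    dist = [[l1(a, b) for b in rows] for a in rows]

    def mean_dist(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = (
            (p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        p, q = min(pairs, key=lambda pq: mean_dist(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Made-up 4 x 4 gene-gene similarity matrix (assumption, not real data).
toy_similarity = [
    [1.0, 0.9, -0.8, -0.7],
    [0.9, 1.0, -0.6, -0.9],
    [-0.8, -0.6, 1.0, 0.8],
    [-0.7, -0.9, 0.8, 1.0],
]
modules = average_linkage(toy_similarity, n_clusters=2)
```

On real data a library routine would be the practical choice; this version only spells out what "L1 norm and average linkage" means under the assumption above.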

So I have tried several combinations of methods to imitate their workflow.

For the expression matrix, I have two choices. The first is a merged count matrix from the different conditions. The second is a normalized data matrix (the default of the NormalizeData function in Seurat: log((count / total counts of the spot) * 10000 + 1)). For the correlation, I have used either Spearman or Pearson to calculate the correlation matrix.
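The combinations described above can be sketched as follows. This is a plain-Python illustration, not Seurat itself: `log_normalize` applies the formula quoted above, Spearman is computed as Pearson on average ranks, and all function names and the toy counts are hypothetical.

```python
import math


def log_normalize(counts, scale=10_000):
    """counts: spots x genes. Returns log(count / spot_total * scale + 1)."""
    normalized = []
    for spot in counts:
        total = sum(spot)
        # Spots with zero total counts are left as zeros.
        normalized.append(
            [math.log1p(c / total * scale) if total else 0.0 for c in spot]
        )
    return normalized


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # Constant vectors have no defined correlation.
    return cov / (sx * sy) if sx and sy else float("nan")


def ranks(values):
    """Average ranks, 1-based; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def gene_correlation(matrix, method=pearson):
    """matrix: spots x genes. Returns a genes x genes correlation matrix."""
    genes = [list(col) for col in zip(*matrix)]
    return [[method(a, b) for b in genes] for a in genes]


# Toy data (assumption): 4 spots x 3 genes, spots 3 and 4 about 10x deeper.
toy_counts = [
    [1, 2, 0],
    [2, 1, 1],
    [10, 20, 5],
    [20, 10, 10],
]
raw_corr = gene_correlation(toy_counts, pearson)
norm_corr = gene_correlation(log_normalize(toy_counts), pearson)
```

In this toy example, genes 0 and 1 are positively correlated in the raw counts but negatively correlated after normalization, because the per-spot total dominates the raw values. Whether the same effect explains a real heatmap would have to be checked on the actual data.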

But I got stuck.

When I use the count matrix, whichever correlation method I choose, I get a heatmap with mostly positive values, which looks strange. With the normalized data matrix (for which I only calculated Pearson), I get a heatmap with a sparse pattern, which also looks strange, though I find it hard to describe exactly how.

My questions:

  • Which combination of data and correlation method should I use?
  • Would this workflow weaken the gene-gene correlations, since some genes may be correlated only in a specific condition?
  • Do you have any other thoughts on my approach?

Looking forward to your reply!
