I am trying to build a pairwise table of gene-gene correlations. My input matrix is large (13,000 genes by 900 samples), and for several reasons I do not want to reduce the number of genes. The gene-gene correlation matrix would therefore be 13,000 x 13,000, and the pairwise table would have 169 million rows and 4 columns (Column 1: Gene 1; Column 2: Gene 2; Column 3: correlation; Column 4: p-value). At this size I need to skip as many unnecessary calculations as I can. I have already excluded the case where Gene 1 = Gene 2, but I could not find a way to exclude mirrored pairs, where the row (Gene 1, Gene 2) repeats the row (Gene 2, Gene 1). In short, the correlation between G1 and G2 equals the correlation between G2 and G1, so I only need the part of the symmetric matrix below the diagonal. I would be grateful for any help with this. My Python code is below:

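import pandas as pd
import numpy as np
import scipy
import math
import openpyxl
from openpyxl import Workbook
from scipy.stats import spearmanr
.
.
# Rows are genes (gene names in the index), columns are samples
din = pd.read_csv('m_test.csv', index_col=0)
.
out = pd.DataFrame()
outdf = pd.DataFrame()
.
for g1 in din.index:
    for g2 in din.index:
        temp = din.loc[[g1, g2]]
        if g1 == g2:
            # Skip the correlation of a gene with itself
            # ("next" is not a Python statement; "continue" is)
            continue
        else:
            spR, spP = spearmanr(temp.loc[g1], temp.loc[g2])
            frame = {'g1': [g1], 'g2': [g2], 'spR': [spR], 'spP': [spP]}
            out = pd.DataFrame(frame)
            # Append one row per ordered pair, so (g1, g2) and (g2, g1) both appear
            outdf = pd.concat([outdf, out])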
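
For what it is worth, here is a minimal sketch of one way the mirrored pairs could be skipped; it is not tested on the real data. It assumes, as in the code above, that rows of `din` are genes and columns are samples. The function names `pairwise_spearman_loop` and `pairwise_spearman_matrix` and the toy data are made up for illustration. The first function uses `itertools.combinations`, which yields each unordered pair once and never pairs a gene with itself, so 13,000 genes give 13,000 x 12,999 / 2 = 84,493,500 rows instead of 169 million. The second passes the whole matrix to `spearmanr` once and keeps only the entries above the diagonal with `np.triu_indices`; for symmetric results that is the same set of pairs as the lower triangle.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import spearmanr


def pairwise_spearman_loop(din):
    """One spearmanr call per unordered pair of rows (genes)."""
    rows = []
    # combinations() gives (G1, G2) but never (G2, G1) or (G1, G1)
    for g1, g2 in combinations(din.index, 2):
        spR, spP = spearmanr(din.loc[g1], din.loc[g2])
        rows.append((g1, g2, spR, spP))
    # Build the DataFrame once instead of calling pd.concat on every pair
    return pd.DataFrame(rows, columns=['g1', 'g2', 'spR', 'spP'])


def pairwise_spearman_matrix(din):
    """One spearmanr call on the whole matrix, then keep the upper triangle.

    Assumes at least 3 genes, so that spearmanr returns full matrices.
    """
    # axis=1: each row (gene) is a variable, each column (sample) an observation
    rho, pval = spearmanr(din.to_numpy(), axis=1)
    # k=1 excludes the diagonal, so only pairs with i < j are kept
    i, j = np.triu_indices(len(din.index), k=1)
    return pd.DataFrame({
        'g1': din.index[i],
        'g2': din.index[j],
        'spR': rho[i, j],
        'spP': pval[i, j],
    })


# Toy data standing in for m_test.csv (4 genes, 10 samples)
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(4, 10)), index=['G1', 'G2', 'G3', 'G4'])

loop_df = pairwise_spearman_loop(toy)
matrix_df = pairwise_spearman_matrix(toy)
print(len(loop_df), len(matrix_df))  # 4 genes -> 6 unordered pairs each
```

The loop version is the closest to the original code but will likely be slow for about 84 million pairs. The matrix version should be much faster, at the cost of holding two 13,000 x 13,000 float64 arrays in memory (roughly 1.35 GB each) before the triangle is extracted.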
