I am reporting some cross-phenotype genetic correlations (estimated with LDSC) that are statistically significant after multiple-testing correction. Peer reviewers have asked us to report the top SNPs that might contribute to these correlations. I understand that a genetic correlation reflects a genome-wide effect, but I want to satisfy reviewers and readers. Has the field arrived at a best practice for identifying and prioritizing the SNPs or regions that mediate a significant genetic correlation? I have a few ideas below and would welcome feedback from the community of experts.

1) It would be simple to threshold the two sets of GWAS results at some p-value (an arbitrary threshold?) and look at the overlap between them. The surviving SNPs could be clumped to LD-independent markers, and nearby genomic features could be reported. This is simple, but the selection is arbitrary.

2) Meta-analysis has been a common approach for combining data across phenotypes. We might see some signals persist or become more significant if a SNP is associated with both phenotypes. We also might see some associations driven by only one phenotype, so this method isn't exactly specific to SNPs contributing to correlation. Overall, this approach doesn't seem to answer the exact question I'm asking. From my cursory understanding, an MTAG analysis also wouldn’t exactly answer my question.

3) pHESS (ρ-HESS) is a new method that seems promising for addressing my question (https://www.biorxiv.org/content/biorxiv/early/2016/12/08/092668.full.pdf). However, this approach does not control for ancestry stratification as nicely as the LDSC method.
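
To make idea 1 concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption for illustration: the input layout (SNP ID mapped to chromosome, position, p-value), the 1e-5 threshold, the 250 kb window, and the function names. The pruning step uses physical distance as a crude stand-in for real LD clumping, which would need a reference panel and a tool such as PLINK.

```python
from typing import Dict, List, NamedTuple, Tuple

# Hypothetical input layout: SNP ID -> (chromosome, base-pair position, p-value)
Gwas = Dict[str, Tuple[str, int, float]]


class Hit(NamedTuple):
    snp: str
    chrom: str
    pos: int
    p1: float  # p-value in phenotype 1
    p2: float  # p-value in phenotype 2


def shared_hits(gwas1: Gwas, gwas2: Gwas, p_thresh: float = 1e-5) -> List[Hit]:
    """Keep SNPs that pass the (arbitrary) p-value threshold in BOTH GWAS."""
    hits = []
    for snp, (chrom, pos, p1) in gwas1.items():
        if snp not in gwas2:
            continue
        p2 = gwas2[snp][2]
        if p1 < p_thresh and p2 < p_thresh:
            hits.append(Hit(snp, chrom, pos, p1, p2))
    return hits


def prune_by_distance(hits: List[Hit], window: int = 250_000) -> List[Hit]:
    """Greedy distance-based pruning (NOT true LD clumping).

    SNPs are ranked by the weaker of their two p-values, so a lead SNP
    has to be reasonably strong in both phenotypes. A SNP is dropped if
    an already-kept SNP lies within `window` bp on the same chromosome.
    """
    kept: List[Hit] = []
    for h in sorted(hits, key=lambda h: max(h.p1, h.p2)):
        if all(h.chrom != k.chrom or abs(h.pos - k.pos) > window for k in kept):
            kept.append(h)
    return kept


# Toy data, made up purely to exercise the functions
toy_gwas1: Gwas = {
    "rs1": ("1", 1_000, 1e-8),
    "rs2": ("1", 50_000, 1e-6),
    "rs3": ("2", 500, 1e-9),
    "rs4": ("3", 100, 0.2),
}
toy_gwas2: Gwas = {
    "rs1": ("1", 1_000, 1e-7),
    "rs2": ("1", 50_000, 1e-9),
    "rs3": ("2", 500, 0.5),
    "rs4": ("3", 100, 1e-10),
}

toy_shared = shared_hits(toy_gwas1, toy_gwas2)
toy_leads = prune_by_distance(toy_shared)
```

On the toy data, rs3 and rs4 are dropped because each is significant in only one phenotype, and rs2 is pruned because it sits within 250 kb of rs1. A natural extension would be to also check that the effect directions at each surviving SNP agree with the sign of the genetic correlation, but that needs effect estimates aligned to the same allele, which this sketch does not handle.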

As an extension of this question: have we arrived at methods for assessing whether the SNPs/regions mediating a genetic correlation are significantly enriched for biological pathways or functions? The approach laid out here seems relevant, but I haven't had a chance to read the article yet (https://www.biorxiv.org/content/early/2017/03/07/114561).
