I want to choose several SNPs in a gene for association studies, starting with those that gave positive results in previous studies. But if these SNPs are not tag SNPs, should I consider the LD?
My opinion is: first choose SNPs that have been published and showed interesting results or functionality; however, do not genotype SNPs that are in LD with each other (r2 > 0.8). When you start a project with new genes and no prior knowledge, select tag SNPs.
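The rule above (prioritise published or functional SNPs, then skip anything in LD at r2 > 0.8 with a SNP already chosen) can be sketched as a simple greedy filter. This is only an illustration, not a replacement for a proper tagging tool: the SNP names and r2 values are made up, and any pair missing from the table is assumed to have r2 = 0.

```python
def prune_by_ld(snps, r2, threshold=0.8):
    """Greedily keep SNPs in priority order, skipping any SNP whose r2
    with an already-kept SNP exceeds the threshold.

    snps: SNP identifiers, highest priority first.
    r2:   dict mapping frozenset({snp1, snp2}) -> pairwise r2.
          Missing pairs are treated as r2 = 0 (an assumption of this sketch).
    """
    kept = []
    for snp in snps:
        in_ld = any(
            r2.get(frozenset((snp, other)), 0.0) > threshold for other in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


# Hypothetical SNP names and r2 values, for illustration only.
candidates = ["snp_published", "snp_functional", "snp_other"]
pairwise_r2 = {
    frozenset(("snp_published", "snp_functional")): 0.92,
    frozenset(("snp_published", "snp_other")): 0.35,
    frozenset(("snp_functional", "snp_other")): 0.40,
}

selected = prune_by_ld(candidates, pairwise_r2)
print(selected)
```

Here snp_functional is dropped because its r2 with the higher-priority snp_published is above 0.8, while snp_other is kept.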
I agree with Gloria about selecting SNPs that have previously shown association in other studies, then choosing SNPs that look like they might be functional. However, I do not think the r2 value is particularly useful, as it is so dependent on allele frequency. You could have a pair of SNPs with an r2 value of 0.1 that are nevertheless in complete disequilibrium, and there would be no point using both (see my answer to Kerry Pettigrew's question on the use of D' or r2 values; see the link below). D' is a more direct measure of whether alleles at two SNPs are co-inherited (i.e. in disequilibrium).
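To make the difference between the two measures concrete, here is a minimal sketch using the standard definitions of D, D' and r2 for two biallelic SNPs. The haplotype and allele frequencies in the example are invented: a rare allele (frequency 0.05) that always occurs on the same haplotype as a common allele (frequency 0.5) at the second SNP.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1
          and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the alleles at the two SNPs.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Invented frequencies: allele A (0.05) is only ever seen together with allele B (0.5).
d, d_prime, r2 = ld_measures(p_ab=0.05, p_a=0.05, p_b=0.5)
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

With these numbers D' comes out at 1 (complete disequilibrium) while r2 is only about 0.05, purely because the allele frequencies at the two SNPs are so different.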
When choosing the SNPs you should also consider allele frequency: the higher the minor allele frequency (MAF), the more likely you are to detect an effect. However, don’t discount SNPs with a low MAF, as you can sometimes find rare alleles with a large effect (high odds ratio). Generally, I would look for SNPs with a MAF greater than 0.1 and a D' below about 0.3, but you may find useful SNPs where the D' is greater in cases than in controls. For example, we found useful SNPs with a D' of 0.32 in controls and 0.76 in cases, which is a good indication that, while they show some LD, they are each detecting something different in the association study.
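For anyone who wants to apply the MAF > 0.1 guideline to their own genotype counts, a minimal sketch follows. The genotype counts are invented, and the 0.1 cut-off is just the rule of thumb from the post above, not a fixed standard.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of A or B is less frequent.
    return min(freq_a, 1 - freq_a)


# Invented counts for 1000 genotyped subjects.
maf = minor_allele_frequency(n_aa=640, n_ab=320, n_bb=40)
passes_guideline = maf > 0.1  # rule-of-thumb threshold from the post
print(f"MAF = {maf:.2f}, passes MAF > 0.1: {passes_guideline}")
```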
Thanks everyone for these helpful replies. In one gene there are 3 SNPs which have shown association in earlier studies. From HapMap data, the pairwise D' is about 0.9 and r2 is about 0.6. Should I still genotype all 3 SNPs in our new sample or just choose one?
Flora, you should take into account the frequency of the three SNPs (are they > 0.1?) and the size of your DNA collection (the larger it is, the more robust the results you can obtain). Are any of the three SNPs more relevant than the others, e.g. non-synonymous, or in the 3' or 5' UTR? Where are they located in the gene? Considering that there are only three, you could genotype them all and perform both single-SNP association tests and haplotype analyses to see whether that increases the power of the study.
I imagine with an r2 of 0.6 the SNPs must have reasonably similar allele frequencies. With a D' of 0.9 they are almost in total disequilibrium, so there is almost no point including more than one, but which? I agree with Gloria that you should look at whether they might be functional, but if there is no other indicator then just choose the SNP with the highest allele frequency, as this is likely to represent the ancestral haplotype. Again, I agree with Gloria that the more subjects you have the greater the power to detect association, but don't sacrifice good phenotyping for the sake of increasing numbers. If the phenotype is not clearly defined you might just as well select random controls.
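The point about r2 of 0.6 implying similar allele frequencies can be checked numerically. The sketch below fixes D' and asks what r2 follows for different pairs of allele frequencies, using the standard definitions. It assumes D is positive (the two alleles being compared tend to occur together), and the frequencies are invented rather than taken from HapMap.

```python
def r2_from_dprime(d_prime, p_a, p_b):
    """r2 implied by a given D' and allele frequencies, assuming D > 0."""
    d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    d = d_prime * d_max
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Invented allele-frequency pairs, all evaluated at D' = 0.9.
r2_equal = r2_from_dprime(0.9, 0.30, 0.30)
r2_close = r2_from_dprime(0.9, 0.25, 0.30)
r2_far = r2_from_dprime(0.9, 0.10, 0.30)
print(f"equal: {r2_equal:.2f}, close: {r2_close:.2f}, far apart: {r2_far:.2f}")
```

With D' held at 0.9, r2 is about 0.81 when the frequencies are equal, about 0.63 for 0.25 versus 0.30, and only about 0.21 for 0.10 versus 0.30, so an r2 near 0.6 is only reachable when the allele frequencies are fairly close.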