I find that PLINK provides a simple means to generate scores or profiles for individuals based on an allelic scoring system involving one or more SNPs. I can use this to assign a single quantitative index of genetic load, to build multi-SNP prediction models, or to set up a quick way to identify a list of individuals carrying one or more of a set of variants of interest. The basic command is
./plink --bfile mydata --score myprofile.raw
which takes as a parameter the name of a file (here myprofile.raw) that describes the scoring system. The file consists of one or more lines, each with exactly three fields:
SNP ID
Reference allele
Score (numeric)
for example
SNPA A 1.95
SNPB C 2.04
SNPC C -0.98
SNPD C -0.24
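The three-field layout is easy to check before handing the file to PLINK. The following is a minimal Python sketch and not part of PLINK itself: the function name parse_score_lines is made up for illustration, and it assumes the fields are separated by whitespace, as in the example above.

```python
def parse_score_lines(lines):
    """Parse scoring-file lines into (snp_id, reference_allele, score) tuples."""
    entries = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 fields, got {len(fields)}")
        snp_id, allele, score = fields
        # float() raises ValueError if the third field is not numeric
        entries.append((snp_id, allele, float(score)))
    return entries


# The example scoring file from the text above
example = ["SNPA A 1.95", "SNPB C 2.04", "SNPC C -0.98", "SNPD C -0.24"]
for snp_id, allele, score in parse_score_lines(example):
    print(snp_id, allele, score)
```

To check a real file, the same function can be given an open file handle, since it only iterates over lines.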
These scores can be based on whatever we want. One choice might be the log of the odds ratio for significantly associated SNPs, for example. Then, running the command above would generate a file
plink.profile
with one individual per row and the fields:
FID Family ID
IID Individual ID
PHENO Phenotype for that individual
CNT Number of non-missing SNPs used for scoring
CNT2 The number of named alleles
SCORE Total score for that individual
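To pick out individuals who carry at least one of the scored alleles, the output can be filtered on CNT2. Below is a rough Python sketch that assumes plink.profile is whitespace-delimited with a header row naming the six fields listed above. The helper name carriers and the sample rows are invented for illustration and are not real PLINK output.

```python
def carriers(profile_lines):
    """Return (FID, IID, SCORE) for every row whose CNT2 is greater than zero."""
    rows = [line.split() for line in profile_lines if line.strip()]
    header, records = rows[0], rows[1:]
    # Look columns up by name rather than position (assumes a header row exists)
    col = {name: index for index, name in enumerate(header)}
    found = []
    for rec in records:
        if int(rec[col["CNT2"]]) > 0:
            found.append((rec[col["FID"]], rec[col["IID"]], float(rec[col["SCORE"]])))
    return found


# Made-up rows in the assumed layout, not taken from a real run
sample = [
    "FID IID PHENO CNT CNT2 SCORE",
    "F1 I1 2 4 3 5.94",
    "F2 I1 1 4 0 0",
]
for fid, iid, score in carriers(sample):
    print(fid, iid, score)
```

For an actual run, the lines could come from open("plink.profile") instead of the sample list.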
The score is simply a sum across SNPs of the number of reference alleles (0, 1 or 2) at that SNP multiplied by the score for that SNP. For example,