I want to do this in R or Linux:

I have a matrix in text format. The rows are SNPs and the columns are samples; each cell holds a genotype or unmeasured data (A/A, C/T, 0/0). I want to find the variants that are called in 20% of the samples, that is, for each row (each SNP), find the genotypes that are observed in 20% of the samples. Could you please help me?

            A      B      C      D
1:14773     0/0    0/0    0/0    0/0
1:14907     A/G    A/G    A/A    A/C
1:14930     A/G    A/T    A/G    A/G
1:14933     G/A    0/0    0/0    G/A
1:14948     0/0    0/0    0/0    0/0
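
One possible reading of this request can be sketched on Linux with awk. The sketch rests on assumptions that are not stated above: 0/0 marks an unmeasured (uncalled) genotype, "called in 20% of the samples" means at least 20% of the samples in a row have a genotype other than 0/0, the columns are separated by whitespace, and the first line is the sample header. The function name filter_by_call_rate is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep SNP rows whose call rate is at least a given percent.
# Assumptions: 0/0 means "not called", field 1 is the SNP ID,
# line 1 is the sample header, and columns are whitespace-separated.
filter_by_call_rate() {
    # $1 = minimum call rate in percent (e.g. 20); the table is read from stdin
    awk -v min_pct="$1" '
        NR == 1 { print; next }           # pass the header line through unchanged
        NF < 2  { next }                  # skip blank lines between rows
        {
            called = 0
            for (i = 2; i <= NF; i++)     # count samples with a genotype other than 0/0
                if ($i != "0/0") called++
            # integer comparison avoids floating-point rounding at the threshold
            if (called * 100 >= min_pct * (NF - 1)) print
        }
    '
}

# Run the filter on the example table from the question with a 20% threshold.
result=$(filter_by_call_rate 20 <<'EOF'
            A      B      C      D
1:14773     0/0    0/0    0/0    0/0
1:14907     A/G    A/G    A/A    A/C
1:14930     A/G    A/T    A/G    A/G
1:14933     G/A    0/0    0/0    G/A
1:14948     0/0    0/0    0/0    0/0
EOF
)
printf '%s\n' "$result"
```

With the example table this keeps 1:14907, 1:14930 and 1:14933 and drops the two rows that contain only 0/0. For a real file the same function can read it from standard input, for example filter_by_call_rate 20 < genotypes.txt, where genotypes.txt is a placeholder name. If the goal is instead to report each individual genotype that appears in 20% of the samples of a row, the counting step would need to tally every distinct genotype separately.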
