Hello,
I have this sample that was sequenced as two different libraries.
I do not have access to the BAM files, only PLINK files.
Each PLINK file has that specific sample as its single individual.
Each PLINK file has a different number of SNPs, so some SNPs are probably present in both files and others in only one.
Can I merge the two PLINK files so that I get all SNPs, choosing at random which file each duplicated SNP is kept from?
E.g.
SampleA1 - 400000 SNPs
SampleA2 - 150000 SNPs
As some SNPs will only exist in one file and not in the other, I wish to merge the files and keep all of them, discarding for example the duplicated ones from the second sample.
Thanks!
EDIT: As mentioned above, I really needed to keep a single allele at every variant, hence the need to discard one of the two copies of each duplicated SNP; merging both copies would in some cases produce biallelic genotypes. This is because my data was/is pseudo-haploid. If this is irrelevant for you, you can use PLINK's (b)merge (--merge / --bmerge) instead.
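
For anyone wanting to do the same, here is a minimal sketch of the "discard one of the two at random" step. It is not a definitive implementation. It assumes binary PLINK filesets (.bed/.bim/.fam), that the same SNP carries the same variant ID (second column of the .bim) in both files, and the function names are just made up for illustration. It only builds two lists of SNP IDs to drop, one per file, so that afterwards no SNP is present in both.

```python
import random


def read_bim_ids(path):
    """Return the variant IDs (2nd column) of a PLINK .bim file, in file order."""
    ids = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                ids.append(fields[1])
    return ids


def split_shared_variants(ids_a, ids_b, seed=None):
    """For every ID present in both lists, pick at random which file drops it.

    Returns (drop_from_a, drop_from_b). Assumption: matching is done purely
    on variant ID, not on chromosome/position.
    """
    rng = random.Random(seed)  # a fixed seed makes the choice reproducible
    shared = sorted(set(ids_a) & set(ids_b))
    drop_from_a, drop_from_b = [], []
    for variant_id in shared:
        if rng.random() < 0.5:
            drop_from_a.append(variant_id)
        else:
            drop_from_b.append(variant_id)
    return drop_from_a, drop_from_b


def write_id_list(path, ids):
    """Write one variant ID per line, the format PLINK's --exclude expects."""
    with open(path, "w") as handle:
        for variant_id in ids:
            handle.write(variant_id + "\n")
```

With the example above, one would read SampleA1.bim and SampleA2.bim with read_bim_ids, write the two lists with write_id_list, remove them from the respective filesets with PLINK's --exclude, and only then merge with --bmerge. Since PLINK matches individuals by family and individual ID, the sample presumably also needs the same IDs in both .fam files for the merge to give a single individual.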