19 July 2021

Hi all, is there a way to filter out all the homozygous genotypes in a multi-sample VCF file? In my VCF file I have unknown genotypes (./.), reference genotypes (0/0), and two homozygous alternative genotypes (1/1 and 2/2). I want to remove these genotypes using the SnpSift.jar tool.

I used the following command to remove the homozygous genotypes:

$ cat SNPs_reheader_annot_passed.vcf.gz | java -jar \
    /home/bandiken/snpEff/SnpSift.jar filter "(countHet() > 0)" \
    > SNPs_reheader_annot_Het.vcf.gz

However, in the final output file (the following screenshot), each row still contains homozygous genotypes (unwanted in my study) alongside a few heterozygous genotypes.

In the following screenshot, the heterozygous genotypes are highlighted in white at each location/row, alongside multiple homozygous genotypes.

So my question is: is this normal?

Is there a way to remove the samples with homozygous genotypes without the whole row being removed?
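
To make the goal concrete, the behaviour I am after is roughly the following. This is only an illustrative Python sketch, not a SnpSift command; the function names `is_het` and `mask_non_het` are made up for the example. It assumes a plain-text, tab-separated VCF in which sample columns start at column 10 and GT is the first FORMAT field. Because a single sample cannot be dropped from just one row of a VCF, "removing" a genotype is taken here to mean setting it to missing (./.).

```python
import re


def is_het(gt: str) -> bool:
    """Return True for a called diploid genotype with two different alleles."""
    alleles = re.split(r"[/|]", gt)
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def mask_non_het(line: str) -> str:
    """Set every non-heterozygous genotype on one VCF line to ./.

    Header lines (starting with '#') are returned unchanged.
    Assumption: GT is the first colon-separated entry of each sample column.
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    # Columns 0-8 are the fixed VCF fields; samples start at index 9.
    for i in range(9, len(fields)):
        gt, sep, rest = fields[i].partition(":")
        if not is_het(gt):
            # Keep the other per-sample values (e.g. DP) and mask only GT.
            fields[i] = "./." + sep + rest
    return "\t".join(fields) + "\n"
```

Applied line by line to the uncompressed VCF, this would keep every row and every sample column, but leave only the heterozygous calls visible.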

Any help would be appreciated.

Thank you.
