My brain goes into spinning-death-wheel mode when I compare two VCF files, because I expected them to be approximately the same size:
VCF file1 ~ 70 MB, data processed with HaplotypeCaller (GATK).
VCF file2 ~ 200 MB, data processed with mpileup (Bcftools).
The same BAM file was processed with these two variant-calling tools to create the VCF files. I used IGV to compare the BAM file with both VCF files, and to my astonishment the BAM file shows all the variants, but GATK picks up only a few of them while Bcftools picks up all of them. I was looking at small regions, so by "all of them" I mean all of the variants visible in the IGV window. I did not want to compare the two VCF files against each other directly; I needed a visual assessment.
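A numeric version of the same per-region check could look like the sketch below. It is only a rough illustration in Python (standard library only), under stated assumptions: the records, the region `chr1:1-500`, and the helper name `sites_in_region` are all made up for the example, and a variant is treated as "the same" only if chromosome, position, REF and ALT match exactly, with no normalization of indels.

```python
from typing import Iterable, Set, Tuple

Site = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def sites_in_region(lines: Iterable[str], chrom: str, start: int, end: int) -> Set[Site]:
    """Collect (CHROM, POS, REF, ALT) keys for VCF records inside one region."""
    sites: Set[Site] = set()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        c, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        if c != chrom or not (start <= pos <= end):
            continue
        for alt in alts.split(","):
            # Ignore missing or symbolic alleles such as "." or "<*>".
            if alt == "." or alt.startswith("<"):
                continue
            sites.add((c, pos, ref, alt))
    return sites


# Hypothetical records standing in for the two real VCF files.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
gatk_lines = [
    "##fileformat=VCFv4.2",
    header,
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t250\t.\tC\tT\t60\tPASS\t.",
]
bcftools_lines = [
    "##fileformat=VCFv4.2",
    header,
    "chr1\t100\t.\tA\tG\t40\t.\t.",
    "chr1\t180\t.\tT\tC,<*>\t12\t.\t.",
    "chr1\t250\t.\tC\tT\t55\t.\t.",
    "chr1\t900\t.\tG\tA\t30\t.\t.",  # outside the region below
]

# Restrict both call sets to one IGV-sized window.
gatk = sites_in_region(gatk_lines, "chr1", 1, 500)
bcftools = sites_in_region(bcftools_lines, "chr1", 1, 500)

shared = gatk & bcftools
only_gatk = gatk - bcftools
only_bcftools = bcftools - gatk

print("shared:", len(shared))
print("only in GATK:", sorted(only_gatk))
print("only in Bcftools:", sorted(only_bcftools))
```

For real files, the lists would be replaced by open file handles (for example `gzip.open(path, "rt")` for a `.vcf.gz`), and the "only in Bcftools" list would give the positions worth inspecting in IGV.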
Why is HaplotypeCaller not calling those variants that mpileup is calling?
Have you had the same experience?
Is there another variant-calling tool that might make this more difficult or easier?
Why are there so many variant-calling tools?
Can we just rely on one? If we cannot, can we make one that combines the best of all of them and excludes the worst of all of them?