I have SNP information in a VCF file and would like to extract the flanking sequence of each SNP. How do I extract it from my reference genome? Do I need to write scripts or programs for it?
Thanks Abhijeet Singh! Why is there a need to convert VCF to FASTA? FASTA is just a sequence file, right? By the way, my VCF does not contain the flanking sequence, only the position and REF/ALT alleles.
Thanks Iman Hassan Ibrahim! Do I need to install any packages besides pysam for the script to run?
Since you want the sequence itself and not just the positional information, converting the VCF to FASTA against the reference will give you the sequence of the flanking region.
Alternatively, if you don't want the conversion, you can do it manually: look up each SNP position in the reference genome and take the bases on either side of it as the flanking region.
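The manual lookup can also be scripted. Below is a minimal sketch in plain Python (standard library only), not the script mentioned elsewhere in this thread. The function names `read_fasta` and `flanks_from_vcf`, the `flank` parameter, and the toy sequence are assumptions made up for illustration. It relies on VCF `POS` being 1-based and on the chromosome names in the VCF matching the FASTA headers. It loads the whole reference into memory, so for a large genome an indexed reader would probably be a better fit.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict of {sequence name: sequence string}."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def flanks_from_vcf(vcf_handle, genome, flank=5):
    """Yield (chrom, pos, ref, alt, left_flank, right_flank) per VCF record."""
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        pos = int(pos)             # VCF positions are 1-based
        seq = genome[chrom]
        start = pos - 1            # 0-based index of the first REF base
        end = start + len(ref)     # first base after the REF allele
        left = seq[max(0, start - flank):start]  # clipped at sequence start
        right = seq[end:end + flank]             # clipped at sequence end
        yield chrom, pos, ref, alt, left, right


# Toy data (hypothetical): one 16 bp contig and one SNP at position 8.
genome = read_fasta(io.StringIO(">chr1 toy contig\nACGTACGT\nACGTACGT\n"))
vcf = io.StringIO("##fileformat=VCFv4.2\n"
                  "#CHROM\tPOS\tID\tREF\tALT\n"
                  "chr1\t8\t.\tT\tC\n")
for chrom, pos, ref, alt, left, right in flanks_from_vcf(vcf, genome, flank=3):
    print(f"{chrom}:{pos} {left}[{ref}/{alt}]{right}")
# prints: chr1:8 ACG[T/C]ACG
```

In real use you would pass open file handles instead of the `io.StringIO` objects. Flanks near the ends of a contig simply come out shorter than `flank`.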