After mapping reads to the reference genome with bwa, I end up with sorted and indexed BAM files. What I would like to understand is how to properly obtain consensus sequences for all exons, so that I can produce protein-coding sequences for further analyses.
I have to do this for about 20 species, and then I need to align the exons across species to produce one multiple sequence alignment file per protein-coding gene. Are there any useful tools that I can use to automate these last few steps?
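To make the per-gene step concrete, here is a minimal Python sketch of what I have in mind. It assumes a consensus sequence has already been called from each species' BAM, and that exon coordinates come from the reference annotation as 0-based, half-open intervals. The function names and the toy data are hypothetical, not taken from any existing tool.

```python
# Sketch only: concatenate exon slices of a consensus sequence into a CDS,
# then collect one unaligned FASTA record per species for a single gene.

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N is kept as N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def build_cds(consensus, exons, strand):
    """Join exon slices in genomic order; flip the result for minus-strand genes.

    exons: list of (start, end) tuples, 0-based half-open (assumed convention).
    """
    joined = "".join(consensus[start:end] for start, end in sorted(exons))
    return reverse_complement(joined) if strand == "-" else joined


def gene_fasta(cds_by_species):
    """Format one gene's CDS from every species as a FASTA string."""
    return "".join(f">{name}\n{seq}\n" for name, seq in cds_by_species.items())


# Toy data: one gene with two exons, seen in two hypothetical species.
exons = [(0, 3), (6, 9)]
cds = {
    "species_A": build_cds("AAACCCGGGTTT", exons, "+"),
    "species_B": build_cds("AAACCCGGATTT", exons, "+"),
}
print(gene_fasta(cds))
```

The resulting per-gene FASTA would then go to a multiple sequence aligner, which is the part I would most like to automate.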
Thanks!