I am trying to optimize a BWA + SAMtools/GATK pipeline for an exome dataset. To do that, I need to know the maximum memory BWA uses while running the aln and samse/sampe subcommands. Any input would be appreciated.
The maximum is usually whatever you can spare. The minimum requirements, according to Heng Li, the author of BWA, are: "With bwtsw algorithm, 5GB memory is required for indexing the complete human genome sequences. For short reads, the aln command uses ~3.2GB memory and the sampe command uses ~5.4GB". Note, though, that the requirements differ depending on the reference genome, the number of reads you have to align, the read length, etc.
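Since the numbers depend on your data, it may be easiest to measure the peak yourself on a representative sample. Below is a minimal sketch in Python, assuming a Linux or macOS machine. The helper name `peak_rss_mb` and the file names in the commented usage (`ref.fa`, `reads.fq`, `reads.sai`) are placeholders of mine, not anything from BWA.

```python
import resource
import subprocess
import sys


def peak_rss_mb(cmd, stdout_path=None):
    """Run cmd (a list of arguments); return (exit_code, peak RSS in MB).

    Caveat: RUSAGE_CHILDREN reports the largest child waited for so far
    in this interpreter, so measure one command per script run.
    """
    out = open(stdout_path, "wb") if stdout_path else subprocess.DEVNULL
    try:
        result = subprocess.run(cmd, stdout=out)
    finally:
        if stdout_path:
            out.close()
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return result.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Hypothetical usage (placeholder file names), one command per run:
    #   peak_rss_mb(["bwa", "aln", "ref.fa", "reads.fq"], "reads.sai")
    # Stand-in command so the sketch runs without BWA installed:
    code, mb = peak_rss_mb([sys.executable, "-c", "x = bytearray(50000000)"])
    print(f"exit code {code}, peak RSS about {mb:.0f} MB")
```

Run it once for aln and once for samse/sampe on a subset of your reads against your actual reference. If GNU time is installed, `/usr/bin/time -v` followed by the command should report a similar "Maximum resident set size" figure.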