I am using bwa to map single-end reads to a reference genome with the following commands:
bwa-0.7.5a/bwa index -a bwtsw ref.fna
bwa aln ref.fna reads.fq > in.sai
bwa samse ref.fna in.sai reads.fq > out.sam
samtools view -S out.sam -b -o out.bam
samtools sort out.bam out.sorted.bam
bam2fastq -o reads.fq --no-aligned out.sorted.bam
samtools mpileup -uf ref.fna out.sorted.bam | bcftools view -cg - | vcfutils.pl vcf2fq > final.fastq
seqret -osformat fasta final.fastq -out2 final.fa
My final output file looks like this: nnnnnnnnnnncgctagTGACATATATATctaaaaaaaagctTTGCC
My final output file (final.fa) contains a lot of lowercase bases: the sequence is a mixture of lowercase n, uppercase bases, and lowercase bases. What do the lowercase bases actually mean? Do they relate to the quality of the consensus call at those positions? Should they be discarded or converted to uppercase? Note: my reference genome (ref.fna) does not contain any lowercase bases.
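
To put numbers on the mixture, here is a rough sketch of how the three classes could be tallied. The function name `count_case` is made up for illustration and is not part of bwa, samtools, or seqret. It only counts A/C/G/T and n/N, so any IUPAC ambiguity codes in the file are ignored.

```shell
# count_case is a hypothetical helper: it tallies upper-case bases,
# lower-case bases, and n/N characters in a FASTA file.
count_case() {
    # Skip FASTA header lines, then count characters of each class.
    # gsub() returns the number of replacements it made on the line.
    grep -v '^>' "$1" | awk '
        {
            upper += gsub(/[ACGT]/, "")
            lower += gsub(/[acgt]/, "")
            n     += gsub(/[nN]/, "")
        }
        END { printf "upper=%d lower=%d n=%d\n", upper, lower, n }'
}
```

Running `count_case final.fa` prints a single line of the form `upper=... lower=... n=...`, which should show how much of the consensus falls into each class.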