I recently sequenced a fungal genome using Ion Torrent PGM technology. I have a .bam file, which I used to generate a consensus sequence. In the resulting .fq file I found both lowercase (a, t, g, c) and uppercase (A, T, G, C) nucleotides, but I do not understand what the case means. Does uppercase indicate high coverage and lowercase low coverage? In addition, I found a lot of strange characters (+, @, !!!!!) between the sequences of each @supercontig: what do they mean? I hope someone can help, as I'm new to NGS.
Is there any way to extract the DNA sequences longer than 200 bp that lie between runs of N's in a FASTA file?
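To make the second question concrete, here is a rough sketch of the kind of thing I am after, not a tested pipeline. It assumes the consensus has already been converted to plain FASTA, that gaps are marked with N or n, and that "between n's" means any stretch of bases bounded by N runs or by the ends of a sequence. The function names `read_fasta` and `segments_between_ns` and the file name `consensus.fa` are made up for illustration.

```python
import re


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def segments_between_ns(sequence, min_length=200):
    """Return the stretches of non-N bases that are longer than min_length."""
    # Split on runs of one or more N/n; what remains are the stretches between them.
    pieces = re.split(r"[Nn]+", sequence)
    return [piece for piece in pieces if len(piece) > min_length]


if __name__ == "__main__":
    # "consensus.fa" is a hypothetical file name.
    for name, seq in read_fasta("consensus.fa"):
        for i, segment in enumerate(segments_between_ns(seq), start=1):
            print(f">{name}_segment{i} length={len(segment)}")
            print(segment)
```

Is something along these lines reasonable, or is there an existing tool that already does this?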