I have a set of aligned sequences in FASTA format and want to derive a consensus sequence from the alignment. At most sites, one base clearly occurs more often than the others. At sites where two or more bases occur an equal number of times, which base should be taken? An example is given below:
>Seq_1
ATGCGA
>Seq_2
AT-CGT
>Seq_3
AT-CCG
>Seq_4
AT-CCC
>Seq_5
AA-CT-
By the usual convention, the consensus would be:
Consensus: A T G C [G/C] N
However, a consensus sequence written this way throws an error when it is aligned with other sequences. What should be done in such a scenario, and how should the consensus be called at such sites?
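To make the per-column counting concrete, here is a minimal Python sketch of one possible approach. It rests on three assumptions that are not settled by the question itself: gaps are ignored when counting, tied bases are written as a single IUPAC ambiguity code (for example S for G/C) instead of a bracketed pair, and the downstream aligner accepts IUPAC codes. The function name `consensus` and the `IUPAC` table are illustrative names, not part of any existing tool.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset(bases): code
    for bases, code in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
        "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
    }.items()
}

def consensus(seqs, gap="-"):
    """Majority-rule consensus; ties become one IUPAC ambiguity code.

    Assumption: gaps are ignored when counting, so a column with a single
    base and four gaps still reports that base.
    """
    out = []
    for column in zip(*seqs):  # walk the alignment column by column
        counts = Counter(b.upper() for b in column if b != gap)
        if not counts:  # column contains only gaps
            out.append(gap)
            continue
        top = max(counts.values())
        tied = frozenset(b for b, n in counts.items() if n == top)
        # Anything outside the table (e.g. unexpected characters) falls back to N.
        out.append(IUPAC.get(tied, "N"))
    return "".join(out)

aligned = ["ATGCGA", "AT-CGT", "AT-CCG", "AT-CCC", "AA-CT-"]
print(consensus(aligned))  # ATGCSN
```

With the five sequences above, the fifth column (G, G, C, C, T) becomes S and the sixth (A, T, G, C, gap) becomes N, so every position is a single character. Whether gaps should instead count as a state, or whether a minimum frequency threshold should apply, is a separate choice that this sketch does not address.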