I have ~17k amino acid sequences in FASTA format in a single file. I created a distance matrix with the following Clustal Omega command on a Linux system:
clustalo -i filename.faa --distmat-out=filename.faa.mat --full
This produced a 17k x 17k matrix, with one row and one column per sequence. The matrix has the following format:
A B C D E ....
A 0.000 0.136 0.227 0.476 0.864
B 0.136 0.000 0.318 0.571 0.864
C 0.227 0.318 0.000 0.238 0.773
D 0.476 0.571 0.238 0.000 0.857
E 0.864 0.864 0.773 0.857 0.000
.
.
Working with data in this form is difficult for me because the matrix is symmetric, so every distance appears twice. I would like the distance matrix written as a list of pairs, in the following format:
A B 0.136
A C 0.227
A D 0.476
A E 0.864
B C 0.318
B D 0.571
B E 0.864
C D 0.238
C E 0.773
D E 0.857
.
.
Data in this format, covering all 17k sequences, would be much easier for me to work with.
Can anybody help me convert the distance matrix to this format?
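
A minimal Python sketch of the conversion described above, offered as a starting point rather than a tested solution. It assumes each data row of the matrix file is a sequence name followed by one distance per sequence, separated by whitespace, and that rows appear in the same order as the columns. Lines that do not look like data rows (a header row of names, a sequence-count line, blank lines) are skipped. The function names are made up for illustration.

```python
import sys


def parse_row(line):
    """Return (name, distance_strings) for a data row, else None."""
    fields = line.split()
    if len(fields) < 2:
        return None  # blank line, "." filler, or a sequence-count line
    try:
        float(fields[1])  # assumption: a data row has a number in field 2
    except ValueError:
        return None  # e.g. a header row made of sequence names
    return fields[0], fields[1:]


def read_rows(path):
    """Stream the data rows of the matrix file one at a time."""
    with open(path) as handle:
        for line in handle:
            row = parse_row(line)
            if row is not None:
                yield row


def upper_triangle(names, rows):
    """Yield (name_i, name_j, distance) once per pair, with i < j."""
    for i, (name, dists) in enumerate(rows):
        if len(dists) != len(names):
            raise ValueError(
                f"row {name!r}: expected {len(names)} values, got {len(dists)}"
            )
        for j in range(i + 1, len(names)):
            yield name, names[j], dists[j]


def main(path):
    # Pass 1 collects the names; pass 2 streams the rows, so the full
    # 17k x 17k matrix is never held in memory at once.
    names = [name for name, _ in read_rows(path)]
    for a, b, dist in upper_triangle(names, read_rows(path)):
        print(a, b, dist)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a hypothetical name such as pairs.py, it could be run as `python pairs.py filename.faa.mat > filename.faa.pairs` (the output file name is also just an example). The distances are copied through as text, so they keep whatever precision the matrix file has. For n sequences the output has n(n-1)/2 lines, which is roughly 144 million lines for 17,000 sequences.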