I have an .arp file containing around 22,000 mtDNA sequences, each 244 bp in length. I have been trying to compute pairwise FSTs and the number of shared haplotypes, but this has not worked.
Prior to this analysis I rigorously screened the file for potential problems using the Arlequin log file as a guide. The .arp file is ready for analysis and WILL compute standard and molecular diversity indices (intra-population analysis) but FAILS to compute the pairwise FSTs and the number of shared haplotypes (inter-population analysis). The log file states that there is not enough memory, although this seems unlikely for a number of reasons: a) the computer being used for the analysis has 128 GB of RAM and over 10 TB of hard drive space; b) the amount of RAM allocated to each program is 128 GB; c) the sequences are only 244 bp each in length.
I tried removing the structure in the .arp file to see whether this would help, but it still failed. I then tried removing around 7,000 sequences. This did work, and the FSTs were computed.
The only thing that would explain why the initial analysis did not work is that Arlequin has a limit on the number of sequences/haplotypes that can be analysed at one time, although this is not stated in the Arlequin manual. The manual states 'the amount of data that Arlequin can handle mostly depends on the memory available on your computer'.
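As a rough check on the memory reasoning above, the sketch below estimates how much memory a table of pairwise values would need at 22,000 sequences versus roughly 15,000 (after removing about 7,000). It rests on assumptions that may not match what Arlequin actually does: that a value is stored for every pair of sequences (the worst case, where every sequence is a distinct haplotype), that each value takes 4 bytes, and that the table is either a full square or a lower triangle. The function name `pairwise_matrix_bytes` is made up for illustration. If the program turned out to be a 32-bit build (not verified here), its usable memory would be capped at a few GB regardless of the 128 GB installed, which is the kind of limit this arithmetic would help to test.

```python
def pairwise_matrix_bytes(n_items, bytes_per_entry=4, full_square=True):
    """Bytes needed to store one value per pair among n_items.

    Assumption: 4-byte entries and a dense layout; Arlequin's real
    internal storage is not documented in the post and may differ.
    """
    if full_square:
        entries = n_items * n_items                # n x n table
    else:
        entries = n_items * (n_items - 1) // 2     # lower triangle only
    return entries * bytes_per_entry


GIB = 1024 ** 3

# 22,000 is the full data set; 15,000 approximates the reduced set that worked.
for n in (22_000, 15_000):
    full = pairwise_matrix_bytes(n) / GIB
    tri = pairwise_matrix_bytes(n, full_square=False) / GIB
    print(f"{n} sequences: full square {full:.2f} GiB, triangle {tri:.2f} GiB")
```

The point of the comparison is only that storage for pairwise values grows with the square of the number of sequences, so the 244 bp sequence length matters far less than the sequence count.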
So I am wondering whether anyone has had problems like this before, and if so, how did you overcome them?
Thanks in advance.