I need to provide the number of haplotype clusters (K) in my data for the hapFLK test (Fariello et al. 2013, Genetics).
I am trying to estimate this number for my whole-genome dog and wolf data (21Mb SNPs) using the fastPHASE cross-validation procedure.
I created a test file (attached) and ran fastPHASE with different settings including:
./fastPHASE -T10 -KL3 -KU30 -Ki2 -Ks50 -Km1000 test.PHASE
However, fastPHASE always reports K=15, which I believe is incorrect because the value never changes, even when I change the input file. It also never writes a file with the _kselect extension, as it should. Maybe this is a bug. I am using fastPHASE 1.4.8, which is available at http://scheet.org/software.html. I tried to contact the author of the software, but have not received a response.
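For reference, here is a small sketch of the K values I would expect the command above to scan. It assumes that -KL, -KU and -Ki mean lower bound, upper bound (inclusive) and step; that is only my reading of the options, not something I have confirmed. The function name candidate_k_values is mine and is not part of fastPHASE.

```python
def candidate_k_values(lower, upper, step):
    """Return the K grid implied by a lower bound, an inclusive upper bound and a step.

    Assumption: this mirrors how -KL / -KU / -Ki are interpreted; not confirmed.
    """
    return list(range(lower, upper + 1, step))


# Values taken from the command line above: -KL3 -KU30 -Ki2
ks = candidate_k_values(3, 30, 2)
print(ks)
```

Under that reading, 15 is one of the candidate values, so K=15 is not impossible in itself. What looks suspicious is that the result is identical for every input file and that no _kselect file appears.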
I also tried the PHASE program, but it is extremely slow and cannot handle even a subset of 7K SNPs.
Could anyone share their experience with estimating the number of haplotypes using fastPHASE or any other program?