I have thousands of sequences and I want to find SNPs in two different regions of a fungal genome. What is the best software for this, and does anyone know of free software for this analysis?
I used bwa and samtools to create a VCF file and wrote a script to create a FASTA file of only the variant sites from the VCF file. Here's the pipeline and scripts:
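A minimal sketch of what the VCF-to-FASTA step could look like; this is not the poster's script, only an illustration. It assumes a tab-delimited, multi-sample VCF with a GT field, keeps SNP sites only (indels are skipped), and takes the first allele of each genotype, which suits haploid calls but discards heterozygosity in diploid ones. The function name `vcf_to_variant_fasta` is made up for this example.

```python
def vcf_to_variant_fasta(vcf_lines):
    """Build one FASTA record per sample from the SNP sites of a VCF.

    Assumptions (not from the original post): multi-sample VCF with a GT
    field; only the first allele of each genotype is used.
    """
    samples = []
    seqs = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            seqs = {s: [] for s in samples}
            continue
        alleles = [fields[3]] + fields[4].split(",")
        # Keep single-base substitutions only; skip indels and symbolic alleles.
        if any(len(a) != 1 or a not in "ACGT" for a in alleles):
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_idx]
            first = gt.replace("|", "/").split("/")[0]
            # A missing genotype (".") is written as N.
            base = alleles[int(first)] if first.isdigit() else "N"
            seqs[sample].append(base)
    return "".join(f">{s}\n{''.join(seqs[s])}\n" for s in samples)


if __name__ == "__main__":
    import sys

    # Usage: python vcf_to_fasta.py calls.vcf > variant_sites.fasta
    with open(sys.argv[1]) as handle:
        sys.stdout.write(vcf_to_variant_fasta(handle))
```

The resulting FASTA contains only the variable positions, so every sequence has the same length and can go straight into an alignment viewer or a tree-building program.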
For this kind of analysis I use BioEdit (http://www.mbio.ncsu.edu/bioedit/bioedit.html), but I think it is only really suitable for alignments of 1-10 sequences, maybe 20. I don't know of any freeware for comparing thousands of sequences, but maybe:
http://www.snpator.org/public/new_login/index.php
Not freeware, but it sounds good and you can order a free trial: http://www.dnastar.com/t-sub-nextgen-genome-solutions-snp-analysis.aspx
If you are familiar with Ubuntu, you could use SAMtools (command line only, but free). CLC and DNASTAR are useful but expensive (you could try the trial versions for the SNP calling).
You could easily build a pipeline to detect SNPs using nucmer from the MUMmer package. I assume you are familiar with bash scripting. A bash script that runs nucmer, delta-filter and show-snps in a loop will give you a good start. Extending the script to parse and analyze the output will give you not only the results you expect, but also a pipeline that can be improved further.
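The loop described above might look roughly like this. It is written in Python rather than bash and is only a sketch under a few assumptions: the MUMmer binaries are on the PATH, each query is a separate FASTA file, and the function names, output directory and file names are invented for illustration. The flags used are `delta-filter -1` (one-to-one alignments) and `show-snps -C -l -r -T` (skip SNPs in ambiguous alignments, report sequence lengths, sort by reference, tab-delimited output).

```python
import subprocess
from pathlib import Path


def mummer_snp_commands(reference, query, prefix):
    """Return (argv, stdout_file) pairs for one reference/query comparison.

    stdout_file is None when the tool writes its own output file.
    """
    delta = f"{prefix}.delta"      # written by nucmer itself
    filtered = f"{prefix}.filter"  # delta-filter prints to stdout
    snps = f"{prefix}.snps"        # show-snps prints to stdout
    return [
        (["nucmer", f"--prefix={prefix}", reference, query], None),
        (["delta-filter", "-1", delta], filtered),
        (["show-snps", "-Clr", "-T", filtered], snps),
    ]


def run_pipeline(reference, queries, outdir="snps"):
    """Run nucmer, delta-filter and show-snps for every query file."""
    Path(outdir).mkdir(exist_ok=True)
    for query in queries:
        prefix = str(Path(outdir) / Path(query).stem)
        for argv, stdout_file in mummer_snp_commands(reference, query, prefix):
            if stdout_file is None:
                subprocess.run(argv, check=True)
            else:
                with open(stdout_file, "w") as out:
                    subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    import sys

    # Usage: python mummer_snps.py reference.fasta query1.fasta query2.fasta ...
    run_pipeline(sys.argv[1], sys.argv[2:])
```

Each query ends up with its own `.snps` table, which is the natural place to hook in the parsing and analysis step.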
If you have reads from Next Generation Sequencing (NGS), there are a number of tools that can profile SNPs and indels effectively and efficiently.
As Visam suggested, MIRA is one of the best at the moment. If you already have mapping results in SAM, BAM or BED format (BWA is a very good mapping tool), SAMtools can be used effectively to profile SNPs and indels. You could also use BEDTools. I have a small pipeline for this, and I could try to dig it up (I wrote it back in 2010) if you want a starting point.
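As a possible starting point (not the 2010 pipeline mentioned above), here is a sketch of a BWA plus SAMtools route from reads to a VCF. It assumes single-end reads in one FASTQ file per sample, a reference already indexed once with `bwa index`, and current releases in which the variant-calling step lives in bcftools, the SAMtools companion program. The function names and output file names are hypothetical.

```python
import subprocess


def mapping_snp_commands(reference, reads, sample):
    """Return (argv, stdout_file) pairs: map, sort, index, pile up, call.

    Assumes `bwa index reference` has already been run once.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    bcf = f"{sample}.pileup.bcf"
    vcf = f"{sample}.vcf"
    return [
        (["bwa", "mem", reference, reads], sam),  # alignments go to stdout
        (["samtools", "sort", "-o", bam, sam], None),
        (["samtools", "index", bam], None),
        (["bcftools", "mpileup", "-f", reference, "-Ou", "-o", bcf, bam], None),
        # -m: multiallelic caller, -v: output variant sites only
        (["bcftools", "call", "-mv", "-Ov", "-o", vcf, bcf], None),
    ]


def run_sample(reference, reads, sample):
    """Execute the steps in order, stopping at the first failure."""
    for argv, stdout_file in mapping_snp_commands(reference, reads, sample):
        if stdout_file is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_file, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    import sys

    # Usage: python call_snps.py reference.fasta reads.fastq sample_name
    run_sample(sys.argv[1], sys.argv[2], sys.argv[3])
```

Writing intermediate files instead of piping keeps every step easy to inspect and rerun, at the cost of some disk space.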
As an NGS beginner, I recommend the tools implemented in the CLC Genomics Workbench from CLC bio. The user-friendly program returns a nice table of SNPs (both substitutions and indels) with calculated frequencies. The only limitation is the cost of the software :-(.