I am looking desperately for an application to analyze mouse exome data from NGS experiments. I tried SnpEff, but I was not very happy with the results.
You first have to clean the reads (the sequenced data): assess quality with FastQC and trim with Sickle or the FASTX-Toolkit. The cleaned reads should then be aligned to the mouse reference genome with BWA or Bowtie2. Downstream post-processing of the alignment files and variant calling with SAMtools would then be helpful. Thanks.
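To make those steps concrete, here is a minimal sketch of that workflow as a Python script that shells out to the tools. The file names (reads_1.fastq, reads_2.fastq, mm10.fa) and output paths are placeholders, and the exact flags should be checked against the versions of FastQC, Sickle, BWA, SAMtools, and BCFtools you actually have installed.

```python
import subprocess

# Placeholder inputs: paired-end reads and the mouse reference genome.
R1, R2 = "reads_1.fastq", "reads_2.fastq"
REF = "mm10.fa"

def run(cmd):
    """Run a shell pipeline and stop if any step fails."""
    print(">>", cmd)
    subprocess.run(cmd, shell=True, check=True)

# 1. Quality assessment of the raw reads.
run(f"fastqc {R1} {R2}")

# 2. Adapter/quality trimming with Sickle (paired-end mode).
run(f"sickle pe -t sanger -f {R1} -r {R2} "
    f"-o trimmed_1.fastq -p trimmed_2.fastq -s singles.fastq")

# 3. Align to the mouse reference with BWA-MEM, then sort and index.
run(f"bwa index {REF}")
run(f"bwa mem {REF} trimmed_1.fastq trimmed_2.fastq | "
    f"samtools sort -o aligned.sorted.bam -")
run("samtools index aligned.sorted.bam")

# 4. Call variants with SAMtools/BCFtools (newer releases use
#    'bcftools mpileup'; older ones use 'samtools mpileup').
run(f"bcftools mpileup -f {REF} aligned.sorted.bam | "
    f"bcftools call -mv -o variants.vcf")
```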
Aamir is right with the basic workflow. The Genome Analysis Toolkit ("GATK") combines tools for most steps needed in NGS analysis into one package, and it is the one that has been used in most publications I have read.
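As a rough illustration of what a single GATK step looks like, here is a hedged sketch of germline variant calling with GATK4's HaplotypeCaller driven from Python. The BAM, reference, and output names are placeholders, and older GATK 3.x releases use a different "java -jar GenomeAnalysisTK.jar -T HaplotypeCaller" invocation, so check the documentation for your version.

```python
import subprocess

# Placeholder inputs: a preprocessed (deduplicated, recalibrated) BAM
# and the mouse reference genome (with .fai and .dict indexes present).
BAM = "sample.bam"
REF = "mm10.fa"
OUT_VCF = "sample.vcf.gz"

# GATK4-style invocation; GATK 3.x uses 'java -jar GenomeAnalysisTK.jar -T ...'
cmd = [
    "gatk", "HaplotypeCaller",
    "-R", REF,       # reference FASTA
    "-I", BAM,       # input alignments
    "-O", OUT_VCF,   # output variant calls
]
subprocess.run(cmd, check=True)
```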
Could you tell me a bit more about your analysis workflow? Did you receive raw FASTQ files or BAM files from your sequencing provider that you need to start from, or did you instead receive VCF files with variants that you want to filter?
If you are planning to process raw reads (the FASTQ output of the sequencer), you will need an aligner to align your reads to the reference genome; the common options for this step are BWA and Bowtie. Following that, in order to process the raw BAM files and make them ready for any kind of analysis, be it variant calling or copy-number analysis, you need to perform a series of adjustments, all implemented and well documented in the Genome Analysis Toolkit (GATK) linked above. After that, depending on the specific goals of your study, you have to choose among an array of available tools: MuTect vs. VarScan2 vs. ... for making somatic variant calls, GISTIC vs. CNAnorm vs. ... for identifying somatic copy-number alterations, etc.
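For illustration, here is a hedged sketch of the usual post-alignment adjustments (duplicate marking and base quality score recalibration) using GATK4 commands driven from Python. The file names and the known-sites VCF are placeholders, and GATK 3.x used a different invocation style (with MarkDuplicates shipped in a separate Picard jar), so treat this as a sketch rather than the exact recipe.

```python
import subprocess

# Placeholder inputs: a sorted, indexed BAM, the reference, and a
# known-variants VCF (e.g. mouse dbSNP) used for recalibration.
BAM = "aligned.sorted.bam"
REF = "mm10.fa"
KNOWN_SITES = "known_variants.vcf.gz"

def gatk(*args):
    """Run a single GATK4 command and fail loudly if it errors."""
    cmd = ["gatk", *args]
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Mark PCR/optical duplicates (Picard tool bundled with GATK4).
gatk("MarkDuplicates",
     "-I", BAM,
     "-O", "dedup.bam",
     "-M", "dedup.metrics.txt")

# 2. Base quality score recalibration: model systematic errors ...
gatk("BaseRecalibrator",
     "-R", REF,
     "-I", "dedup.bam",
     "--known-sites", KNOWN_SITES,
     "-O", "recal.table")

# 3. ... and apply the recalibration to get an analysis-ready BAM.
gatk("ApplyBQSR",
     "-R", REF,
     "-I", "dedup.bam",
     "--bqsr-recal-file", "recal.table",
     "-O", "sample.recal.bam")
```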
If you are looking for a user-friendly tool for variant annotation and functional effect prediction, I recommend that you check out the CRAVAT web server at www.cravat.us. It is completely free for academic use, supports VCF input, and provides annotations from a large number of databases.
Most of the tools you need to analyze this type of data are available in Galaxy, but you need to know the steps (FASTQ quality assurance, alignment, counting the reads, and then statistical analysis). Each of these steps can be performed with different software packages. You also need to define your main objective (differential gene expression, or transcript-level analysis such as alternative splicing) in order to decide which pipeline to employ.
Have you tried the Exomiser? (http://www.sanger.ac.uk/resources/databases/exomiser/)
"The Exomiser is a Java program that functionally annotates variants from whole-exome sequencing data starting from a VCF file (version 4). The functional annotation code is based on Annovar and uses UCSC KnownGene transcript definitions and hg19 genomic coordinates.
Variants are prioritised according to user-defined criteria on variant frequency, pathogenicity, quality, inheritance pattern, and model organism phenotype data. Predicted pathogenicity data was extracted from the dbNSFP resource. Cross-species phenotype comparisons come from our PhenoDigm tool powered by the OWLSim algorithm.
The Exomiser was developed by the Computational Biology and Bioinformatics group at the Institute for Medical Genetics and Human Genetics of the Charité - Universitätsmedizin Berlin, the Mouse Informatics Group at the Sanger Institute and the Lewis group at the Lawrence Berkeley National Labs."
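To give a feel for the kind of frequency- and quality-based prioritisation described above, here is a small Python sketch that filters a VCF by call quality and a population allele-frequency annotation. This is not Exomiser code, just an illustration; the `AF` INFO key, the thresholds, and the file name are assumptions that would need to match your own annotated VCF.

```python
# Toy prioritisation of variants from a VCF: keep rare, high-quality calls.
# Assumes the INFO column carries an allele-frequency annotation under 'AF'.
MAX_AF = 0.01      # keep variants rarer than 1% (assumed threshold)
MIN_QUAL = 30.0    # keep variants with QUAL >= 30 (assumed threshold)

def info_to_dict(info_field):
    """Parse a VCF INFO column like 'AF=0.002;DP=54' into a dict."""
    out = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out

def prioritise(vcf_path):
    """Yield (chrom, pos, ref, alt, qual) for variants passing the filters."""
    for line in open(vcf_path):
        if line.startswith("#"):        # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _, ref, alt, qual, filt, info = fields[:8]
        if filt not in ("PASS", "."):   # respect upstream filters
            continue
        if qual != "." and float(qual) < MIN_QUAL:
            continue
        af = info_to_dict(info).get("AF")
        if af is not None:
            # Multi-allelic sites may list several comma-separated values.
            if min(float(x) for x in af.split(",")) > MAX_AF:
                continue
        yield chrom, pos, ref, alt, qual

if __name__ == "__main__":
    for variant in prioritise("annotated_variants.vcf"):
        print(*variant, sep="\t")
```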