02 February 2015

Hi All,

I have:

A) a FASTA file containing a few thousand short sequences (40 bp each)

B) an 8 GB fastq.gz file containing millions of unassembled Illumina paired-end reads (100 bp each)

I would like to BLAST each of the sequences in A against the reads in B and count the number of hits for each sequence.

Does anyone know a good strategy for this? To use blastn, I would first need to convert the fastq.gz file to FASTA in order to build a BLAST database, right? Is there a better way to do this, or are there any tools that would be computationally faster?
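For reference, the FASTQ-to-FASTA conversion step I have in mind could be sketched roughly like this in Python. This is only a minimal sketch: the function name `fastq_gz_to_fasta` is made up for illustration, and it assumes the usual four-lines-per-record FASTQ layout with no wrapped sequence lines. It streams the file, so the 8 GB archive never has to fit in memory.

```python
import gzip


def fastq_gz_to_fasta(fastq_gz_path, fasta_path):
    """Stream a gzipped FASTQ file into FASTA and return the number of reads.

    Assumes four lines per record (header, sequence, '+', quality),
    which is an assumption about the input, not something checked in depth.
    """
    n_reads = 0
    with gzip.open(fastq_gz_path, "rt") as fq, open(fasta_path, "w") as fa:
        while True:
            header = fq.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines
            if not header.startswith("@"):
                raise ValueError("unexpected FASTQ header: " + header.strip())
            seq = fq.readline()
            fq.readline()  # '+' separator line, not needed for FASTA
            fq.readline()  # quality string, not needed for FASTA
            fa.write(">" + header[1:].rstrip("\n") + "\n")
            fa.write(seq.rstrip("\n") + "\n")
            n_reads += 1
    return n_reads
```

Calling `fastq_gz_to_fasta("reads.fastq.gz", "reads.fasta")` (file names are placeholders) would write the FASTA file, which could then be passed to `makeblastdb -in reads.fasta -dbtype nucl`, if building a BLAST database really is the right route.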

Hope someone can advise!

Cheers

David
