Hi All,
I have:
A) a FASTA file containing a few thousand short sequences (40 bp each)
B) an 8 GB fastq.gz file containing millions of unassembled Illumina paired-end reads (100 bp each)
I would like to BLAST each of the sequences in A against the reads in B and count the number of hits for each sequence in A.
Does anyone know a good strategy for this? To use blastn, I would first need to convert the fastq.gz file to FASTA in order to create a BLAST database, right? Is there a better way to do this, and are there any tools that would be computationally fast?
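For reference, the conversion step I have in mind would look roughly like the sketch below. It is only a minimal illustration, assuming standard 4-line FASTQ records (no wrapped sequence lines); the function name and the file names in the usage comment are hypothetical placeholders, not my real paths.

```python
import gzip


def fastq_gz_to_fasta(fastq_gz_path, fasta_path):
    """Stream a gzipped FASTQ file and write each read as a FASTA record.

    Assumes every record is exactly 4 lines: header, sequence, '+', quality.
    Returns the number of reads written.
    """
    n_reads = 0
    with gzip.open(fastq_gz_path, "rt") as fq, open(fasta_path, "w") as fa:
        while True:
            header = fq.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            if not header.startswith("@"):
                raise ValueError("Unexpected FASTQ header: " + header.strip())
            seq = fq.readline()
            fq.readline()  # '+' separator line, ignored
            fq.readline()  # quality line, not needed for FASTA
            # Replace the leading '@' with '>' and keep the rest of the read ID.
            fa.write(">" + header[1:].rstrip("\n") + "\n")
            fa.write(seq.rstrip("\n") + "\n")
            n_reads += 1
    return n_reads


# Hypothetical usage (file names are placeholders):
# fastq_gz_to_fasta("reads.fastq.gz", "reads.fasta")
```

The file is read line by line, so memory use should stay small even for an 8 GB input. I assume the resulting FASTA could then be passed to makeblastdb with -dbtype nucl, but I have not tested this at full scale.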
Hope someone can advise!
Cheers,
David