Hi, I'm working on sequence alignment algorithms, and my background is in Computer Science. Given two sequences, what is the maximum gap length I should allow, i.e. how many consecutive insertions/deletions in one stretch should I consider? My feeling is that allowing more than one insertion/deletion in a single stretch is not useful, but I'm not sure. My algorithm accepts a large text file and reports the locations in the file where the best-matching regions are found.
For example:
Query: ATCGACTAACCA
File: TCAGCTTCCAGCTA
When I aligned these two strings on ebi.ac.uk, I got the following result, with 7 matching pairs:
EMBOSS_001 A T C G A C T A A C C A - - - -
EMBOSS_001 - T C A G C T - T C C A G C T A
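As a sanity check on the count of 7, here is a minimal Python sketch that walks the two aligned rows column by column and counts matches, mismatches, and the length of each consecutive gap stretch. The function name `summarize_alignment` and the `-` gap symbol are just my own assumptions for illustration; this is not part of EMBOSS and not my actual alignment algorithm.

```python
def summarize_alignment(row_a, row_b, gap="-"):
    """Count matches, mismatches and gap-run lengths of a pairwise alignment.

    row_a, row_b: the two aligned rows, with or without spaces between symbols.
    Returns (matches, mismatches, gap_runs), where gap_runs lists the length
    of every consecutive stretch of gaps within one row, in column order.
    """
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same number of columns")

    matches = 0
    mismatches = 0
    gap_runs = []
    prev_gap_row = None  # which row held the gap in the previous column

    for x, y in zip(a, b):
        if x == gap or y == gap:
            # A column with gaps in both rows is attributed to row "a".
            gap_row = "a" if x == gap else "b"
            if gap_row == prev_gap_row:
                gap_runs[-1] += 1      # extend the current stretch
            else:
                gap_runs.append(1)     # open a new stretch
            prev_gap_row = gap_row
        else:
            prev_gap_row = None
            if x == y:
                matches += 1
            else:
                mismatches += 1
    return matches, mismatches, gap_runs


# The EMBOSS rows from above, typed in by hand.
print(summarize_alignment("A T C G A C T A A C C A - - - -",
                          "- T C A G C T - T C C A G C T A"))
# prints (7, 3, [1, 1, 4])
```

So the EMBOSS alignment has 7 matches, 3 mismatches, and three gap stretches of lengths 1, 1 and 4. Counting matches alone ignores the gaps and mismatches, so a comparison between two alignments probably needs all three numbers.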
However, my algorithm reports an alignment with 8 matching pairs. Which one is better? Please advise. Many thanks in advance.
Query: A T C G A C T A A C C A
File:  - T C A G C T T C C A G C T A