BLAST is meant to find non-chance similarities between biological sequences. Can anyone explain what "non-chance similarities between biological sequences" means?
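BLAST is an algorithm that divides your query sequence into k-mers (short subsequences of your sequence, also called words) and searches a database with these fragments. Once it finds a matching word in a database sequence, it extends that match into a pairwise alignment, inserting gaps where needed, and calculates statistics for the result. The most important values are the percent identity (the higher the value, the more similar the sequences) and the E-value. The E-value is not strictly a probability: it is the number of hits with a score at least this good that you would expect to find purely by chance in a database of that size. The higher the E-value, the more likely the similarity is just chance, so the best E-values are zero or close to zero. A "non-chance" similarity is simply a hit whose E-value is low enough that random matching is an unlikely explanation.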
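
To make the seeding step and the E-value more concrete, here is a minimal Python sketch. It is not how BLAST is actually implemented: it only finds exact k-mer matches, it skips the extension and gap steps, and the function names (`kmers`, `find_seeds`, `e_value`) are made up for illustration. The E-value function uses the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths and S is the alignment score. The values of K and lambda in the usage lines are placeholders, not parameters taken from any real scoring matrix.

```python
import math


def kmers(seq, k):
    """Return every overlapping k-mer in seq together with its start position."""
    return [(seq[i:i + k], i) for i in range(len(seq) - k + 1)]


def find_seeds(query, subject, k):
    """Return (query_pos, subject_pos) pairs where a query k-mer occurs exactly in subject."""
    # Index the subject once so each query word is looked up in constant time.
    index = {}
    for word, pos in kmers(subject, k):
        index.setdefault(word, []).append(pos)
    return [(qpos, spos)
            for word, qpos in kmers(query, k)
            for spos in index.get(word, [])]


def e_value(score, m, n, K, lam):
    """Expected number of chance hits scoring at least `score` (Karlin-Altschul)."""
    return K * m * n * math.exp(-lam * score)


if __name__ == "__main__":
    # Toy sequences; each seed is a position where an alignment could start.
    print(find_seeds("ACGT", "TTACGA", 3))
    # Placeholder K and lambda: a higher score gives a smaller E-value.
    print(e_value(20, m=100, n=1_000_000, K=0.1, lam=0.3))
    print(e_value(60, m=100, n=1_000_000, K=0.1, lam=0.3))
```

The point to take from the formula is that the E-value grows with the size of the search space (m * n) and shrinks exponentially with the score, which is why the same alignment can look significant against a small database and unremarkable against a very large one.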