I am working with a plant species in which we discovered a very interesting satellite DNA. I would like to find out how this satDNA is organized in this species and to check whether there is any evidence of higher-order repeat (HOR) structure.
You can find the most efficient algorithm for long HORs in Matko Gluncic and Vladimir Paar, "Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm", Nucleic Acids Research 42,1 (2012) e47.
You can download the software from the web address given in that paper.
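To give a rough idea of what a repeat-period search looks for, here is a minimal sketch in Python. It is not the GRM algorithm from the paper and not the software mentioned above; it only counts the distances between consecutive occurrences of the same k-mer, which is a much cruder approach. The function name `kmer_distance_spectrum`, the parameters `k` and `max_dist`, and the toy sequence are all made up for illustration. In a tandem array, k-mers shared by all monomers recur at the monomer length, while k-mers specific to one monomer variant recur only at the HOR length.

```python
from collections import Counter


def kmer_distance_spectrum(seq, k=8, max_dist=5000):
    """Count distances between consecutive occurrences of each k-mer.

    Illustrative helper (hypothetical, not part of GRM): peaks in the
    returned Counter hint at repeat periods present in the sequence.
    """
    last_seen = {}        # k-mer -> start position of its latest occurrence
    spectrum = Counter()  # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            dist = i - last_seen[kmer]
            if dist <= max_dist:
                spectrum[dist] += 1
        last_seen[kmer] = i
    return spectrum


# Toy example: two 10 bp monomer variants differing at one base,
# arranged as a 2-monomer HOR (20 bp) repeated five times.
m1 = "ACGGTCATTG"
m2 = "ACGGTCTTTG"
toy = (m1 + m2) * 5

spectrum = kmer_distance_spectrum(toy, k=5)
print(sorted(spectrum))  # → [10, 20]
```

In the toy sequence the distance 10 corresponds to the monomer and the distance 20 to the two-monomer HOR. Real satellite arrays are far noisier, so this only shows the kind of signal one is after.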
Alternatively, you can give me access to some of your sequence data and we can run searches for large HORs. We would be interested in such a collaboration.
Perhaps it would be simplest if you sent me some of your sequences, or gave me a web address where the sequence is available, so that we can quickly look for HORs.
We are interested in looking at plants for large HORs (and also for start/stop-codon-like trinucleotides in noncoding regions; see our papers Rosandic et al., J Theor Biol (2013) and Gene (2013)).
But can you detect that in short Illumina reads? This would be a very nice advance in the characterization of tandem repeat arrays. I will send you an email with some information about our findings. Greetings, André
We performed Illumina paired-end sequencing, with reads of only 100 bp. We have about 40 million reads for each of the three species sequenced...
The most abundant tandem repeat, which seems to be centromere-specific, has a 171 bp monomer length.