I am trying to identify the number and locations of a repeated sequence in a eukaryotic genome. Kindly suggest some genome viewer tools/software other than CLC-Bio.
Assuming you are comfortable working in a Unix environment, RepeatMasker (www.repeatmasker.org) is a good option. It's designed for masking all the repeats in a genome, but if there is a specific one that you are interested in, you can specify a 'repeat database' containing only that sequence and it will just search for that (and close variants). Its output includes both overall statistics (copy number, etc.) and the locations of every repeat.
As I said it needs Unix/Linux and also requires several other free software packages (Perl, some form of BLAST, etc.) to be installed on your system, so it's worth getting your system administrator to do it.
It can be run through their website or locally. It gives you a table of the repeats, their positions and number of repeats.
Please note that if your genome was obtained with a short-read NGS technique (Illumina, for instance), the reported number of repeats is not reliable whenever the total repeat region is longer than the read length used to sequence the genome.
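To make that caveat concrete, here is a rough Python sketch of the check. The function name and the optional `min_flank` parameter (unique bases wanted on each side of the repeat so a read can anchor it) are my own assumptions for illustration, not part of any tool mentioned here.

```python
def repeat_region_exceeds_read(unit_length: int, copies: int,
                               read_length: int, min_flank: int = 0) -> bool:
    """Return True if a tandem repeat region cannot fit inside a single read.

    unit_length : length of one repeat unit in bases
    copies      : reported number of consecutive copies
    read_length : sequencing read length in bases
    min_flank   : hypothetical extra unique bases wanted on each side
    """
    if unit_length <= 0 or copies <= 0 or read_length <= 0 or min_flank < 0:
        raise ValueError("lengths and copy number must be positive, flank non-negative")
    region_length = unit_length * copies
    # If the region (plus flanks) is longer than a read, no single read spans it,
    # so the assembled copy number should be treated with suspicion.
    return region_length + 2 * min_flank > read_length


# Example: a 10 bp unit reported 20 times, sequenced with 150 bp reads.
print(repeat_region_exceeds_read(10, 20, 150))
```

If this returns True for a reported repeat, the copy number in the assembly is best confirmed with longer reads or another method.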
@Stuart_J_Lucas Sir, I am using Windows, so I need Windows software.
@Matej_Lexa Sir, similar to the genome of Arabidopsis thaliana, which is available in the TAIR database, I have a complete genome sequence of chickpea. Now I want to search for a sequence, e.g. NNNNN (8 to 12 nucleotides), in this genome to find the total number of repeats and their intervals/positions in the genome. For this purpose I need a software/tool.
If you need to identify short sequence patterns, you could even use grep or agrep. They are a bit tricky under Windows, but there seem to be equivalents (see e.g. http://stackoverflow.com/questions/87350/what-are-good-grep-tools-for-windows).
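In case a grep equivalent is awkward to set up, the same kind of exact search can be sketched in plain Python, which also runs on Windows. This is only a minimal sketch under my own assumptions: the function names are made up for illustration, matches are exact (no mismatches), overlapping hits are counted, the reverse strand is searched via the reverse complement, and positions are reported 1-based.

```python
import re
from typing import Iterable, Iterator, List, Tuple

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(records: Iterable[Tuple[str, str]],
               motif: str) -> List[Tuple[str, int, int, str]]:
    """Return sorted (record_id, start, end, strand) hits, 1-based inclusive."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    # A palindromic motif equals its reverse complement; report it once.
    targets = [("+", motif)] if rc == motif else [("+", motif), ("-", rc)]
    hits = []
    for name, seq in records:
        for strand, pattern in targets:
            # Zero-width lookahead so that overlapping occurrences are found.
            for m in re.finditer("(?=" + re.escape(pattern) + ")", seq):
                start = m.start() + 1
                hits.append((name, start, start + len(pattern) - 1, strand))
    hits.sort()
    return hits


# Tiny in-memory demo; for a real genome pass an open file handle instead.
demo_lines = [">chr1 demo", "ACGTTGCAACGT", "TTGCAA", ">chr2", "GGTTGCAA"]
demo_hits = find_motif(read_fasta(demo_lines), "ACGTTG")
print(len(demo_hits), "hits")
for hit in demo_hits:
    print(*hit, sep="\t")
```

For a whole genome you would pass an open FASTA file to read_fasta() instead of the demo list. Each sequence is held in memory one record at a time, which should be manageable for a chickpea-sized genome but is not tuned for speed.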
An even better option would be to install R and Bioconductor, which run under Windows, and use the excellent Biostrings package and its matchPattern() function (see e.g. https://www.bioconductor.org/packages/devel/bioc/vignettes/BSgenome/inst/doc/GenomeSearching.pdf). The advantage of this approach is that the results can be held in a format (GRanges) that is readily processed by other analysis and visualization components of Bioconductor, or exported into GFF3. R also has grep for strings.
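For anyone not using R, the GFF3 export step can be approximated in a few lines of Python. This is a sketch under stated assumptions, not what Bioconductor does internally: the function name, the `source` label and the `ID` scheme are hypothetical, hits are assumed to be (sequence id, start, end, strand) tuples with 1-based inclusive coordinates, and the feature type `nucleotide_motif` is my choice of Sequence Ontology term.

```python
from typing import Iterable, Tuple

Hit = Tuple[str, int, int, str]  # (seqid, start, end, strand), 1-based inclusive


def hits_to_gff3(hits: Iterable[Hit],
                 source: str = "motif_search",
                 feature_type: str = "nucleotide_motif") -> str:
    """Format motif hits as GFF3 text (nine tab-separated columns per hit)."""
    lines = ["##gff-version 3"]
    for i, (seqid, start, end, strand) in enumerate(hits, start=1):
        if strand not in ("+", "-", "."):
            raise ValueError(f"unexpected strand: {strand!r}")
        if start < 1 or end < start:
            raise ValueError(f"bad coordinates: {start}-{end}")
        columns = [
            seqid,               # column 1: sequence id
            source,              # column 2: hypothetical source label
            feature_type,        # column 3: feature type
            str(start),          # column 4: start (1-based)
            str(end),            # column 5: end (inclusive)
            ".",                 # column 6: score, not used here
            strand,              # column 7: strand
            ".",                 # column 8: phase, only relevant for CDS
            f"ID=motif_hit_{i}"  # column 9: attributes
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


example_hits = [("chr1", 1, 6, "+"), ("chr1", 7, 12, "-")]
print(hits_to_gff3(example_hits), end="")
```

The resulting text can be saved to a .gff3 file and loaded as an annotation track in most genome browsers.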
Finally, if the search needs to be really fast, the approaches used in sequencing read mapping software, or that software directly, could be used. I myself wrote software for that (see link below).
Article PRIMEX: Rapid identification of oligonucleotide matches in w...
I was wondering whether it is possible to find the contribution of endogenous viral DNA in a host genome. The host genome consists of contigs only, and no complete chromosomal assignment is available. I tried the RepeatMasker web server, but unfortunately it didn't run for me. I would appreciate any advice on this.