We have sequenced the plant transcriptome using next-generation sequencing. Now we need to design SSR markers from these data. Please suggest how to design these markers.
To design SSR markers, you first need to identify the SSRs by searching for repeat motifs. A Perl script using regular expressions can pull out all the sequences that contain common SSR motifs. After that, marker design is a matter of designing primers that flank each repeat and testing them on a set of diverse genotypes.
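The regular-expression search described above could look something like the sketch below. It is written in Python rather than Perl, and it only finds perfect di-, tri-, and tetranucleotide repeats. The function name `find_ssrs` and the minimum repeat counts are illustrative assumptions, not values from this thread; adjust them to your own criteria.

```python
import re

# Minimum number of repeat units per motif length.
# These thresholds are assumptions for illustration only.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) for each perfect SSR in seq.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} requires n or more further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. AA, or ACAC, which is really the dinucleotide AC).
            if motif in (motif + motif)[1:-1]:
                continue
            copies = (m.end() - m.start()) // motif_len
            hits.append((m.start(), m.end(), motif, copies))
    return hits

if __name__ == "__main__":
    example = "GATTACA" + "AG" * 8 + "CCGTTGAG" + "ATC" * 5 + "GGA"
    for start, end, motif, copies in find_ssrs(example):
        print(start, end, motif, copies)
```

The start and end coordinates tell you how much flanking sequence is available on each side of the repeat, which matters when you move on to primer design.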
There is another approach that could yield a lot of useful markers. If you are able to predict intron locations and their approximate lengths (BLAST your transcript sequences against your most closely related genomic sequence and compare), you can easily find indels, because introns often differ in length between genotypes. Choose introns that are less than 500 bp or so and design primers in the flanking exon regions. I designed many successful codominant markers in Brassicas using this method on EST data. Of course, I was fortunate to have the genomic sequence of the very closely related Arabidopsis to use for intron prediction. This method allows you to place markers at genes that may be of interest to you but lack SSRs. Best of luck!
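As a rough sketch of the intron-prediction step above: once you have BLAST hits of a transcript against a related genomic sequence, the gaps between consecutive hits on the genome approximate the introns. The Python below assumes you have already parsed the hits into (qstart, qend, sstart, send) tuples, as in BLAST tabular output, for one transcript against one genomic sequence on the plus strand. The function name `candidate_introns` and the `max_query_gap` tolerance are hypothetical choices; the 500 bp cutoff comes from the suggestion above.

```python
def candidate_introns(hsps, max_intron=500, max_query_gap=10):
    """Infer short introns from transcript-vs-genome alignment blocks.

    hsps: list of (qstart, qend, sstart, send) tuples, 1-based inclusive,
          all from one transcript aligned to one genomic sequence
          on the plus strand (an assumption of this sketch).
    """
    ordered = sorted(hsps)  # order blocks along the transcript
    introns = []
    for prev, nxt in zip(ordered, ordered[1:]):
        q1_end, s1_end = prev[1], prev[3]
        q2_start, s2_start = nxt[0], nxt[2]
        query_gap = q2_start - q1_end - 1    # should be near 0 for adjacent exons
        intron_len = s2_start - s1_end - 1   # genomic bases between the blocks
        if abs(query_gap) <= max_query_gap and 0 < intron_len < max_intron:
            introns.append({
                "transcript_pos": q1_end,     # approximate exon junction in the transcript
                "genomic_start": s1_end + 1,
                "genomic_end": s2_start - 1,
                "length": intron_len,
            })
    return introns

if __name__ == "__main__":
    hits = [(1, 120, 1000, 1119), (121, 300, 1400, 1579), (301, 450, 3000, 3149)]
    for intron in candidate_introns(hits):
        print(intron)
```

The junction positions from BLAST are only approximate, so leave some margin when placing primers in the exons on either side of `transcript_pos`. The intron length in the related genome is a guide to expected product size, not a measurement for your species.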
Depending on what you need them for, transcriptome data may not actually be the best source, since it consists almost entirely of coding sequence. If you're planning on using your SSRs for population genetics, that could be a problem because they aren't neutral and probably won't be as variable; you're better off starting with genomic sequence. But sometimes you have to work with the data you have.
There are programs available that will search a given data set for SSRs. I've used msatcommander in the past, and it has the added advantage of being integrated with Primer3. Designing primers around your SSRs once you've found them is another problem. The reads from Illumina and other second-generation sequencers are usually too short: you'll find SSRs, but you won't have enough flanking sequence to place primers, so for that you'll need assembled data. On the other hand, if your read lengths are long enough you can use the raw data, so 454 data works great. Even with minimal data (
To design SSR markers, you first need to identify the SSRs. A Perl script can help; if you have only a small number of sequences, you can use the online version of MISA, which directly gives an SSR result file. Then run BLAST and in silico validation of the SSR primers before testing them on a gel. Finally, you can test a set of diverse genotypes on PAGE. Best of luck!
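One very simple form of the in silico primer check mentioned above is to confirm that a primer pair matches its source sequence exactly and gives a single product of sensible size. The Python sketch below does only that. It uses exact string matching rather than BLAST, and the function names and the `max_product` limit are assumptions made for illustration. It does not replace a BLAST search of the primers against the whole assembly.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def in_silico_amplicons(template, fwd, rev, max_product=500):
    """Return (start, size) for every exact-match product of a primer pair.

    template: sequence to test, given 5'->3' on the forward strand
    fwd, rev: primers written 5'->3' as they would be ordered
    max_product: largest product size to report (an assumed limit)
    """
    template = template.upper()
    fwd = fwd.upper()
    rev_site = reverse_complement(rev)  # where the reverse primer anneals
    products = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        while end != -1:
            size = end + len(rev_site) - start
            if size <= max_product:
                products.append((start, size))
            end = template.find(rev_site, end + 1)
        start = template.find(fwd, start + 1)
    return products

if __name__ == "__main__":
    contig = "TTGACCTGAAGCTAGC" + "AG" * 10 + "GGTCATCGATTGCAAC"
    print(in_silico_amplicons(contig, "GACCTGAAGC", "GCAATCGATG"))
```

A primer pair that returns more than one product, or none, on its own contig is worth redesigning before you spend time on gels.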