I would like to generate protein motifs/profiles/regular expressions from multiple sequence alignments. Can anybody please suggest a tool for this? Preferably an online one rather than a command-line one.
Sounds like Seq2Logo would be suitable for your objectives (see link). This tool facilitates production of a sequence logo and corresponding motif/profile given a DNA or protein multiple sequence alignment.
Seq2Logo is OK for making nice logos, but I don't think it allows you to get the motif in text form, e.g. [LGM]-[WALV]-W-[QALV], which is what I am trying to achieve.
I did find one online tool: PRATT 2.1 (link below), which does what I need it to do; however, it has an upper limit on the number of sequences I can upload, and I don't want such constraints. I need to create a motif based upon an alignment of nearly 600 sequences.
Should you have command-line access, and maybe a friend with a little knowledge of it, then this little script will do exactly what you're asking for:
Copy/paste it into a file called mkMotif.pl and run it from the command line like so:
cat mySeqFile.fsa | perl mkMotif.pl | less
I wrote the script quickly, so it assumes that the sequences are aligned, i.e. all of the same length, and that there are no multi-line sequences in your FASTA file. For example, a record like this:
>seq1
ASD
DFG
RFG
Will NOT work!
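If Perl isn't an option, here is a rough Python sketch of the same idea. To be clear, this is not the mkMotif.pl script itself, just an illustration: the function names read_fasta and make_motif are my own, the output follows the [LGM]-[WALV]-W-[QALV] style mentioned earlier in the thread, and I am assuming that gap characters ("-" or ".") can simply be ignored, which is a simplification. Unlike the quick script, this version joins multi-line records, so the example above would be read as one sequence.

```python
def read_fasta(lines):
    """Return the sequences from FASTA-formatted lines, joining multi-line records."""
    seqs, chunks = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if chunks:
                seqs.append("".join(chunks))
            chunks = []
        else:
            chunks.append(line.upper())
    if chunks:
        seqs.append("".join(chunks))
    return seqs


def make_motif(seqs, gap_chars="-."):
    """Build a text motif from aligned sequences, one element per alignment column.

    Assumption: gap characters are ignored, and columns consisting only
    of gaps are skipped entirely.
    """
    if not seqs:
        raise ValueError("no sequences given")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, i.e. all of the same length")
    parts = []
    for column in zip(*seqs):
        residues = sorted(set(column) - set(gap_chars))
        if not residues:
            continue  # gap-only column
        if len(residues) == 1:
            parts.append(residues[0])  # fully conserved position
        else:
            parts.append("[" + "".join(residues) + "]")
    return "-".join(parts)
```

To use it as a filter like the Perl one-liner, you could save it under a name of your choice (say mkMotif.py, a hypothetical filename), add `import sys` at the top and `print(make_motif(read_fasta(sys.stdin)))` at the bottom, and pipe the alignment in with `cat mySeqFile.fsa | python mkMotif.py`. Bear in mind that with hundreds of sequences many columns will collect a long list of residues, so you may want to add a frequency cut-off before trusting the result.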
Have fun, and if I may be so bold, consider a basic programming class, e.g. in Perl or Python. It will give you SO MUCH in return. No pun intended! :-)
Thank you for the code – it is much appreciated! I can actually code in Perl and Java, but thanks to long periods of inactivity my skills are very ropey at the moment; moreover, the computer I have access to doesn’t have either programming language installed on it, and I was looking for a quick fix. Your code will definitely come in handy for future endeavors. Thanks again.
Hi Ruhshan,
This seems to be what I was looking for – thanks. I am currently running it (bit slower than I was expecting, but I can’t complain :) ), and will have a proper check of its suitability for my needs when complete.
Thank you to all of you for your helpful advice/suggestions: Parsa, Leon, Ruhshan. Good luck with your research.