I need to input a protein sequence into a bioinformatics program, and it has to be in FASTA format, with uppercase characters denoting exons and lowercase characters denoting introns. Does anyone know how I can do this? Thanks.
Just open the FASTA file in Microsoft Word and select the letters whose case you want to change. Next, click the Aa (Change Case) button and choose UPPERCASE for uppercase letters or lowercase for lowercase letters.
Proteins do not have introns. Introns are found in genes and in the pre-mRNA, before it is spliced to form the messenger RNA that will be translated into protein.
1. Using Word to change case is not a good idea: it is not a plain-text editor, and if you have to use the FASTA file later for anything but publication, you are likely to have problems.
2. I attach a Biopython script that sets the case of your sequence, starting from an accession number (provided you are connected to the internet, of course). A reasonably long sequence, the gene coding for the human insulin receptor, is used as an example, but you may pass any accession number on the command line if you wish.
3. It is a PDF file because Python code requires proper indentation.
4. I will try to add the piece of code.
5. It is not polished and does not pretend to be a piece of programming art. You have been warned.
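Since the attached script is not reproduced in this thread, here is a minimal sketch of the core step only, not the attached script itself. It assumes you already have the gene sequence as a string and the exon positions as 0-based, end-exclusive (start, end) pairs; the function names `case_mask` and `to_fasta` are made up for this illustration, and it uses only the Python standard library.

```python
def case_mask(sequence, exons):
    """Return the sequence with exons in uppercase and everything else in lowercase.

    exons: iterable of (start, end) pairs, 0-based and end-exclusive
    (the same convention as Python slices). This convention is an
    assumption of the sketch, not something fixed by the FASTA format.
    """
    chars = list(sequence.lower())  # start with everything as "intron"
    for start, end in exons:
        if not 0 <= start <= end <= len(chars):
            raise ValueError(f"exon ({start}, {end}) lies outside the sequence")
        chars[start:end] = [c.upper() for c in chars[start:end]]
    return "".join(chars)


def to_fasta(header, sequence, width=60):
    """Format one FASTA record, wrapping the sequence at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy sequence: two 6-base exons separated by a 6-base intron.
    masked = case_mask("atggccgtaagtccttaa", [(0, 6), (12, 18)])
    print(to_fasta("toy_gene", masked), end="")
```

On the toy input the sequence line comes out as ATGGCCgtaagtCCTTAA. Because the case is derived from coordinates rather than from manual selection, the same call works unchanged on a sequence of any length.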
If the software takes a protein as input, then it would not be asking for introns and exons, as these are parts of the pre-mRNA only. However, if the software is asking for the gene sequence, then I agree with the others about using Word.
I still disagree strongly with the suggestion of Word: it may be useful for some 400-500 bp at most. If the sequence is longer, it is too easy to make a mistake, and once a mistake is made, it is not easy to find. The attached script takes the GenBank record identified by the accession number of the sequence and automatically transfers all features into the FASTA file. It may even work on a Windows computer. By default it uses a gene about 15 kb long as an example. Please enjoy the script and stay away from Word while working with DNA sequences. Biopython (BioPerl, BioRuby, etc.) is so much better.
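To illustrate where the exon positions can come from, here is a small standard-library sketch (again, not the attached script) that reads exon ranges out of a GenBank-style feature location such as the one found on an mRNA or CDS line. The function name `exon_intervals` is hypothetical, and the parser is deliberately simplified: it assumes plain `start..end` ranges, optionally marked partial with `<` or `>`, and ignores single-base positions and references to other records.

```python
import re

# Matches ranges like 13..18, <10..20 or 30..>45 inside a location string.
_RANGE = re.compile(r"<?(\d+)\.\.>?(\d+)")


def exon_intervals(location):
    """Convert a GenBank-style location into sorted 0-based, end-exclusive pairs.

    GenBank coordinates are 1-based and inclusive, so 13..18 becomes (12, 18).
    Coordinates inside complement(...) still refer to the strand given in
    the record, so they can be used directly on the record's sequence.
    """
    intervals = [(int(a) - 1, int(b)) for a, b in _RANGE.findall(location)]
    if not intervals:
        raise ValueError(f"no start..end ranges found in {location!r}")
    return sorted(intervals)


if __name__ == "__main__":
    print(exon_intervals("join(1..6,13..18)"))
```

For real records a library such as Biopython parses these locations for you; the sketch only shows that the uppercase and lowercase regions can be taken from the annotation rather than marked by hand.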