Can anyone suggest how to get a list (identify/filter) of all human mitochondrial proteins containing a WXXL motif in their sequence from human mitochondrial protein databases?
See this link: http://www.rcsb.org/pdb/staticHelp.do?p=help/advancedsearch/sequenceMotif.html.
Go to the PDB at http://www.rcsb.org/, type WXXL in the search box, set the Organism to Human, and refine with the advanced search to restrict the results to mitochondrial proteins.
Another procedure that is applicable to a wider range of similar problems:
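If you can export the contents of the database in FASTA format, one option is to modify the file so that each sequence sits entirely on one line (you can use the attached Perl script for that) and then use something like 'grep -B1 "W..L" | grep "^>"' to get the headers with the names of all the WXXL-containing sequences. Be aware that a header line that itself happens to contain W..L will also be reported. If you add '| perl -p -e "s/.*(myIDpattern).*/\$1/"' it will cut out the protein IDs for you as well; myIDpattern stands for a regular expression matching the ID format of your database. (The -i switch is not needed here, since Perl is reading from the pipe rather than editing a file in place.)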
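If you would rather not reformat the file first, the same idea can be sketched in Python. This is only a minimal illustration, assuming a plain FASTA export; the function names read_fasta and headers_with_motif are made up for this example and are not part of any library. It joins wrapped sequence lines itself and only tests the sequence, so a header containing W..L does not cause a false hit.

```python
import re
import sys

# W, then any two residues, then L (the WXXL motif from the question)
MOTIF = re.compile(r"W..L")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Wrapped sequences are joined, so the file does not need to be
    converted to one sequence per line beforehand.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def headers_with_motif(lines, motif=MOTIF):
    """Return the headers of all records whose sequence contains the motif."""
    return [h for h, seq in read_fasta(lines) if motif.search(seq.upper())]


# Hypothetical usage: python find_motif.py exported_database.fasta
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        for h in headers_with_motif(fh):
            print(h)
```

The script name and input file name in the usage comment are placeholders. To pull out just the protein IDs, you could apply a second regular expression to each returned header, in the same way as the Perl substitution above.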