Is there a quick way to retrieve protein sequences associated with the EC number 3.1.1 from any of the protein databases (without database scraping)? How can I do this?
It is a table with the EC number and the sequence information. If you only need to look up one or a few genes now and then, you can quickly find them there.
If I needed to incorporate this retrieval into larger functions, I would use biomaRt to get the corresponding UniProt accessions for the enzymes and match those IDs against a FASTA file to get the sequences.
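The "match the IDs to a FASTA file" step could be sketched in Python roughly as follows. This is only a minimal sketch: it assumes the FASTA headers follow the UniProt layout `>sp|ACCESSION|ENTRY_NAME ...`, the function names are made up for illustration, and the accessions in the usage note are placeholders rather than real enzymes.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def accession_of(header: str) -> str:
    """Extract the accession from a UniProt-style header (assumed layout).

    For 'sp|P12345|NAME_HUMAN ...' this returns the middle field;
    otherwise it falls back to the first whitespace-separated token.
    """
    tokens = header.split()
    if not tokens:
        return ""
    parts = tokens[0].split("|")
    return parts[1] if len(parts) >= 3 else tokens[0]


def select_sequences(lines: Iterable[str], wanted: Iterable[str]) -> Dict[str, str]:
    """Return {accession: sequence} for the records whose accession is wanted."""
    wanted_set = set(wanted)
    selected = {}
    for header, sequence in read_fasta(lines):
        accession = accession_of(header)
        if accession in wanted_set:
            selected[accession] = sequence
    return selected
```

With a real file this would be called as `select_sequences(open("proteome.fasta"), accessions)`, where `proteome.fasta` and `accessions` stand in for your own FASTA file and the accession list obtained from biomaRt.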
At the ExPASy website (https://enzyme.expasy.org/) you have direct access to the "Enzyme" database. You can search it by EC number. For each enzyme it contains the official enzyme name, links to other databases such as PROSITE, KEGG, Medline, MetaCyc etc., and direct links to all sequences with that EC number that are available in the highly curated SwissProt database. The entire database is downloadable for offline use as well.
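If you work from the downloaded copy, the SwissProt cross-references can be pulled out with a few lines of Python. The sketch below assumes the flat-file layout used by the downloadable enzyme.dat file: a two-letter line code followed by three spaces, `ID` lines holding the EC number, `DR` lines holding `ACCESSION, ENTRY_NAME;` pairs, and `//` closing each entry. Please check this against the file you actually download. The function name is made up for illustration.

```python
from typing import Dict, Iterable, List


def swissprot_accessions_by_ec(lines: Iterable[str], ec_prefix: str) -> Dict[str, List[str]]:
    """Map each EC number under ec_prefix to its SwissProt accessions.

    Assumes the enzyme.dat layout: 'ID   3.1.1.1', then one or more
    'DR   P12345, NAME_HUMAN;  Q67890, NAME_MOUSE;' lines, then '//'.
    """
    result: Dict[str, List[str]] = {}
    current = None  # EC number of the entry being read, if it matches
    for line in lines:
        code, value = line[:2], line[5:].strip()
        if code == "ID":
            # Match the prefix on whole EC fields, so 3.1.1 does not match 3.1.10.x
            matches = value == ec_prefix or value.startswith(ec_prefix + ".")
            current = value if matches else None
            if current is not None:
                result[current] = []
        elif code == "DR" and current is not None:
            for item in value.split(";"):
                item = item.strip()
                if item:
                    # Each item is 'ACCESSION, ENTRY_NAME'; keep the accession.
                    result[current].append(item.split(",")[0].strip())
        elif code == "//":
            current = None
    return result
```

Calling `swissprot_accessions_by_ec(open("enzyme.dat"), "3.1.1")` would then give one accession list per EC number in that sub-subclass, which you can feed into whatever sequence retrieval you prefer.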
To download all sequences in UniProt with EC number 3.1:
i. Go to UniProtKB website: https://www.uniprot.org/
ii. In the search box, click on Advanced
iii. Use the dialog boxes to select: “Function > Enzyme classification EC”
iv. In the term field, enter 3.1.* (or 3.1.1.* to restrict the search to EC 3.1.1, as in the question)
v. Click on Search
vi. Edit the tabular results to show EC numbers by clicking on the edit button at the far right side of the table and then selecting EC number under Function.
vii. You can download the results in tabular form as a TSV or Excel file, or choose FASTA as the download format to get the sequences themselves.
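The same search can also be scripted, which avoids clicking through the website each time. The sketch below is hedged: it assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/stream` with `query` and `format` parameters, and that the query field `ec:` accepts the same wildcard pattern as the advanced search form above. The function names are made up for illustration, so check the current UniProt API help pages before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the current UniProt REST documentation.
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"


def build_uniprot_url(ec_pattern: str, reviewed_only: bool = True, fmt: str = "fasta") -> str:
    """Build a download URL for all UniProtKB entries matching an EC pattern."""
    query = f"ec:{ec_pattern}"
    if reviewed_only:
        # Restrict to the curated SwissProt part of UniProtKB.
        query += " AND reviewed:true"
    return UNIPROT_STREAM + "?" + urlencode({"query": query, "format": fmt})


def fetch_sequences(ec_pattern: str, reviewed_only: bool = True) -> str:
    """Download the matching entries as one FASTA-formatted string."""
    url = build_uniprot_url(ec_pattern, reviewed_only=reviewed_only)
    with urlopen(url, timeout=60) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_sequences("3.1.1.*")` should return the reviewed entries for the EC class in the question as FASTA text, and passing `reviewed_only=False` would include the unreviewed entries too, which can be a very large download.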