PeiR is a lytic enzyme from a methanogen virus that infects Methanobrevibacter ruminantium M1. Do all proteases have active sites? If so, how do I find the active site from the sequence alone?
All proteases have an active site, since by definition that is where catalysis occurs.
To identify the likely active site residues, you should align the sequence of PeiR with the sequence of a closely related enzyme whose active site is already characterized.
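To make the alignment idea concrete, here is a minimal sketch in plain Python. It does a simple global alignment (Needleman-Wunsch with a flat gap penalty, which is my simplification; real tools use substitution matrices such as BLOSUM62) and then carries the known catalytic positions of the reference over to the query. The function names, scoring values and the toy sequences are all made up for illustration. They are not PeiR or any real enzyme.

```python
def global_align(ref, query, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a flat gap penalty.

    Returns the two gapped strings (reference first, query second).
    The scoring values are illustrative assumptions, not tuned parameters.
    """
    def sub(a, b):
        return match if a == b else mismatch

    n, m = len(ref), len(query)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(ref[i - 1], query[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover the gapped strings.
    out_ref, out_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + sub(ref[i - 1], query[j - 1])):
            out_ref.append(ref[i - 1])
            out_query.append(query[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_ref.append(ref[i - 1])
            out_query.append("-")
            i -= 1
        else:
            out_ref.append("-")
            out_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(out_ref)), "".join(reversed(out_query))


def map_positions(aligned_ref, aligned_query, ref_positions):
    """Map 1-based reference positions onto the query.

    Returns {ref_pos: (query_pos, query_residue)} or {ref_pos: None}
    when the reference residue is aligned to a gap in the query.
    """
    mapping = {}
    r = q = 0
    for x, y in zip(aligned_ref, aligned_query):
        if x != "-":
            r += 1
        if y != "-":
            q += 1
        if x != "-" and r in ref_positions:
            mapping[r] = (q, y) if y != "-" else None
    return mapping


# Toy example: pretend positions 2 and 5 of the reference are catalytic.
aligned_ref, aligned_query = global_align("ACDEFGH", "ACDFGH")
print(aligned_ref)
print(aligned_query)
print(map_positions(aligned_ref, aligned_query, {2, 5}))
```

If a catalytic residue of the reference maps to a gap or to a chemically different residue in your protein, that is a hint that either the alignment is wrong in that region or the two enzymes are not as closely related as hoped.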
Here is a key paper on PeiR.
Article Inhibition of Rumen Methanogens by a Novel Archaeal Lytic En...
From this paper: "PeiR on the other hand has a catalytic domain belonging to the CA clan of the Peptidase_C39 PFAM family"
There is a database called MEROPS which contains information about proteases. Here is the page from that database on the C39 family of the CA clan.
In my experience, proteins in the same family with at least 20% sequence identity usually have similar structures. The sequences might not align well, but the structures should, at least in the catalytic region.
Get the sequence of the PeiR lytic enzyme. It's probably on UniProt, or you may already know the sequence. I am guessing this is the one:
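If you want to check where a pair of sequences sits relative to that 20% figure, percent identity is easy to compute from two already aligned (gapped) sequences. This is a small sketch with a hypothetical helper name. Note that there are several conventions for the denominator; I am assuming "identical columns divided by columns where neither sequence has a gap", so numbers from other tools may differ slightly.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must be gapped strings of equal length taken from the same
    pairwise alignment. The choice of denominator is an assumption; other
    tools may divide by the full alignment length or the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)


# Toy strings, not real protein sequences.
print(round(percent_identity("ACD-FGH", "ACDEFGY"), 1))
```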
https://www.uniprot.org/uniprotkb/D3DZZ6/entry
Run AlphaFold on it using the Google Colab servers. If the entry above is the right one, you can see on the UniProt page that an AlphaFold model already exists.
AlphaFold page for the PeiR enzyme, linked from UniProt:
https://alphafold.ebi.ac.uk/entry/D3DZZ6
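If you prefer to script the download instead of clicking through the page, something like the sketch below should work. The file URL pattern is my assumption, inferred from the model file name AF-D3DZZ6-F1-model_v4 that AlphaFold DB uses; the version number in particular can change between database releases, so check the download link on the entry page if the request fails.

```python
from urllib.request import urlretrieve


def alphafold_pdb_url(accession, version=4):
    """Build the assumed AlphaFold DB file URL for a UniProt accession.

    The pattern and the default version are assumptions based on the
    file name AF-<accession>-F1-model_v4.pdb; adjust if the database
    has moved to a newer model version.
    """
    return ("https://alphafold.ebi.ac.uk/files/"
            f"AF-{accession}-F1-model_v{version}.pdb")


def download_model(accession, version=4, dest=None):
    """Download the model and return the local file name (needs network)."""
    url = alphafold_pdb_url(accession, version)
    dest = dest or url.rsplit("/", 1)[-1]
    urlretrieve(url, dest)
    return dest


print(alphafold_pdb_url("D3DZZ6"))
```

Calling download_model("D3DZZ6") would then save the file in the current directory under its AlphaFold file name.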
Download the PDB file and use it to search the entire Protein Data Bank for structurally similar proteins or protein complexes. Upload the AlphaFold PDB file to PDBeFold or the Dali server:
PDBeFold: https://www.ebi.ac.uk/msd-srv/ssm/
Dali Server: http://ekhidna2.biocenter.helsinki.fi/dali/
If the sequence I found on Uniprot was correct then I already did the Dali search for you. Results will expire in one week.
Look for proteases or peptidases with a peptide ligand bound. If the predicted structure of your protein superimposes well on a protease in the Dali results, and the crystal structure of that protease includes a bound peptide ligand, note its PDB code and look it up in the Protein Data Bank.
Protein Data Bank:
https://www.rcsb.org/
Download the PDB file of the structurally similar protease with the peptide bound. Open it in PyMOL together with your AlphaFold structure of PeiR peptidase, then superimpose the two structures with the cealign command. cealign is structure-based, so it generally works better than the sequence-based align command when sequence identity is low.
I pretty much already found the active site because there were a lot of structurally similar peptidases with peptides bound in the Dali search.
I won't take the fun from you, though. You can take it from here.
Let me know if you have any questions. This is super fun for me. The answer is just a couple clicks away.
I think I found the active site, by the way, except that I used the "super" command rather than cealign to superimpose the AlphaFold structure of PeiR peptidase onto 6mpz chain A.
Basically:
> super AF-D3DZZ6-F1-model_v4, 6mpz and chain A, object = alignment_to_6mpz
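The same step can be run from a Python script through PyMOL's cmd module. Below is a rough sketch: the function name and the local file path are hypothetical, and cmd is passed in as an argument (you would supply it with "from pymol import cmd") so the function itself can be imported on a machine without PyMOL. It only uses cmd.load, cmd.fetch and cmd.super with the same arguments as the command above.

```python
def superpose_on_reference(cmd, model_path,
                           model_name="AF-D3DZZ6-F1-model_v4",
                           ref_code="6mpz", ref_chain="A"):
    """Load the predicted model, fetch the reference and run `super`.

    cmd        : the PyMOL command module (from pymol import cmd)
    model_path : local path to the AlphaFold PDB file (hypothetical here)
    Returns whatever cmd.super returns, unchanged.
    """
    # Load the AlphaFold model under an explicit object name.
    cmd.load(model_path, model_name)
    # Fetch the reference structure from the PDB by its code.
    cmd.fetch(ref_code)
    # Restrict the target to one chain, as in the command-line version.
    target = f"{ref_code} and chain {ref_chain}"
    # object= keeps the residue pairing as an alignment object for inspection.
    return cmd.super(model_name, target, object=f"alignment_to_{ref_code}")
```

Inside a PyMOL session you would call it as superpose_on_reference(cmd, "AF-D3DZZ6-F1-model_v4.pdb") and then look at the alignment_to_6mpz object to see which residues were paired.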
This was essentially the top ranked hit in the Dali search.
The peptide is chain M of PDB entry 6mpz. Using PyMOL's selection algebra, I selected all residues within 5 angstroms of the peptide after the super alignment and labeled them.
Basically: select active_site, byres all within 5 of 6mpz and chain M
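To show what that selection is doing under the hood, here is a plain-Python sketch of the same idea: keep every residue of the model that has at least one atom within 5 angstroms of any peptide atom. It assumes the two structures are already superimposed and saved in the same coordinate frame, and it reads standard fixed-width PDB ATOM/HETATM records. The function names are mine, and unlike the PyMOL command it only reports residues from the model you pass in, not from the reference structure.

```python
import math


def parse_atoms(pdb_text, chain=None):
    """Read ATOM/HETATM records from PDB-format text.

    Uses the fixed PDB columns: residue name 18-20, chain 22,
    residue number 23-26, x/y/z 31-54. Optionally keep one chain only.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if chain is not None and line[21] != chain:
            continue
        atoms.append({
            "chain": line[21],
            "resseq": int(line[22:26]),
            "resname": line[17:20].strip(),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def residues_within(model_atoms, ligand_atoms, cutoff=5.0):
    """Residues of the model with any atom within `cutoff` of any ligand atom."""
    hits = set()
    for atom in model_atoms:
        if any(math.dist(atom["xyz"], lig["xyz"]) <= cutoff
               for lig in ligand_atoms):
            hits.add((atom["chain"], atom["resseq"], atom["resname"]))
    return sorted(hits)
```

For example, with the superimposed model and 6mpz read into two strings, residues_within(parse_atoms(model_text), parse_atoms(ref_text, chain="M")) would list the candidate active site residues. The brute-force double loop is fine for a single domain and a short peptide.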
I generated vacuum electrostatics for Alphafolded PeiR endopeptidase and set the transparency to 0.4 so you can see the surface representation that the peptide sits over as well as the active site residues.
Action > Generate > Vacuum electrostatics
I may have made a critical mistake, since I don't know much about PeiR peptidase biochemistry, but this makes sense to me. It's up to you to evaluate it. AlphaFold isn't perfect, but it is among the best protein structure prediction tools available.
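One quick sanity check on the prediction: AlphaFold model files store the per-residue confidence score (pLDDT, 0 to 100) in the B-factor column, so you can see whether your candidate active site residues sit in a confidently predicted region. A minimal sketch follows; the function names are hypothetical, and the threshold of 70 is the commonly used boundary below which the prediction is considered low confidence.

```python
def plddt_by_residue(pdb_text):
    """Return {residue_number: pLDDT} read from the CA atoms.

    In AlphaFold PDB files the B-factor field (columns 61-66) holds pLDDT.
    Assumes a single-chain model, so residue numbers are unique.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence(scores, residues, threshold=70.0):
    """Residues from `residues` whose pLDDT is below the threshold.

    Residues missing from `scores` are reported too, since nothing is
    known about them.
    """
    return sorted(r for r in residues if scores.get(r, 0.0) < threshold)
```

If most of the residues around the peptide come back as low confidence, I would treat the predicted pocket with more caution and lean more heavily on the superposition with experimental structures.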
PDB information for 6MPZ.
https://www.rcsb.org/structure/6MPZ
Here is the paper associated with the crystal structure with PDB code 6MPZ: https://elifesciences.org/articles/42305#content
The attached file is a PyMOL session file.
You can align a few more structures now from the Dali or PDBeFold searches.