What is the best way to carry out iterative / homology modeling of PZ?

09 September 2019

I have a very specific problem modeling PZ (Protein Z). The protein has 360 residues, with a Gla domain (1-46), two EGF domains (47-126) and an SP-like domain (135-360). In a similarity search, some PDB entries matched residues 144-360 of PZ with high coverage. However, no PDB entry corresponds to the Gla domain (N-terminal), and therefore none covers the whole protein. My study involves the N-terminal domain, so I need to model the complete protein.

Only one suitable PDB entry is available (1DAN in the RCSB PDB), and it has 4 chains in the molecule. However, whenever I use SWISS-MODEL, it takes only a single chain as the template. Is it possible to use more than one chain in SWISS-MODEL?

What I have seen is that PZ aligns well with 1DAN over its whole length only when two chains of 1DAN are taken and reorganized (the light chain is joined to the heavy chain).
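A minimal sketch of the chain reorganization described above, assuming the chain sequences are available as FASTA text with headers of the form `>1DAN:L|PDBID|CHAIN|SEQUENCE`. The helper names `parse_fasta_chains` and `join_chains` are illustrative only, and the sequences in the usage example are short placeholders, not the full 1DAN chains.

```python
def parse_fasta_chains(fasta_text):
    """Return {chain_id: sequence} for headers like '>1DAN:L|PDBID|CHAIN|SEQUENCE'."""
    chains = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between FASTA lines
        if line.startswith(">"):
            # '>1DAN:L|PDBID|CHAIN|SEQUENCE' -> chain ID 'L'
            current = line[1:].split("|")[0].split(":")[1]
            chains[current] = ""
        elif current is not None:
            chains[current] += line
    return chains


def join_chains(chains, order):
    """Concatenate the chain sequences in the requested order."""
    return "".join(chains[chain_id] for chain_id in order)


# Placeholder sequences (first few residues only), for illustration.
example = """>1DAN:H|PDBID|CHAIN|SEQUENCE
IVGGKV

>1DAN:L|PDBID|CHAIN|SEQUENCE
ANAFLE
ELRPGS
"""

chains = parse_fasta_chains(example)
combined = join_chains(chains, ["L", "H"])  # light chain first, then heavy chain
print(combined)
```

The combined sequence is only a search/alignment query; the structural template still consists of two separate chains, so a modeling program has to be told about the chain break.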

So my question is: how should I model this protein? If MODELLER (developed by Andrej Sali) can be used, which technique should I follow?

I am attaching the sequences:

(1) PZ (2) 1DAN (3) Sequence alignment of PZ and 1DAN.

Thanks.

1. Sequence of PZ:

>PZ

AGSYLLEELFEGNLEKECYEEICVYEEAREVFENEVVTDEF

WRRYKGGSPCISQPCLHNGSCQDSIWGYTCTCSPGYEGSNCELAKNECHPERTDGCQHFC

LPGQESYTCSCAQGYRLGEDHKQCVPHDQCACGVLTSEKRAPDLQDLPWQVKLTNSEGKD

FCGGVIIRENFVLTTAKCSLLHRNITVKTYFNRTSQDPLMIKITHVHVHMRYDADAGEND

LSLLELEWPIQCPGAGLPVCTPEKDFAEHLLIPRTRGLLSGWARNGTDLGNSLTTRPVTL

VEGEECGQVLNVTVTTRTYCERSSVAAMHWMDGSVVTREHRGSWFLTGVLGSQPVGGQAH

MVLVTKVSRYSLWFKQIMN

2. Sequence of 1DAN:

>1DAN:H|PDBID|CHAIN|SEQUENCE

IVGGKVCPKGECPWQVLLLVNGAQLCGGTLINTIWVVSAAHCFDKIKNWRNLIAVLGEHDLSEHDGDEQSRRVAQVIIPS

TYVPGTTNHDIALLRLHQPVVLTDHVVPLCLPERTFSERTLAFVRFSLVSGWGQLLDRGATALELMVLNVPRLMTQDCLQ

QSRKVGDSPNITEYMFCAGYSDGSKDSCKGDSGGPHATHYRGTWYLTGIVSWGQGCATVGHFGVYTRVSQYIEWLQKLMR

SEPRPGVLLRAPFP

>1DAN:L|PDBID|CHAIN|SEQUENCE

ANAFLEELRPGSLERECKEEQCSFEEAREIFKDAERTKLFWISYSDGDQCASSPCQNGGSCKDQLQSYICFCLPAFEGRN

CETHKDDQLICVNENGGCEQYCSDHTGTKRSCRCHEGYSLLADGVSCTPTVEYPCGKIPILEKRNASKPQGR

>1DAN:T|PDBID|CHAIN|SEQUENCE

NTVAAYNLTWKSTNFKTILEWEPKPVNQVYTVQISTKSGDWKSKCFYTTDTECDLTDEIVKDVKQTYLARVFSYPAGNVE

>1DAN:U|PDBID|CHAIN|SEQUENCE

GEPLYENSPEFTPYLETNLGQPTIQSFEQVGTKVNVTVEDERTLVRRNNTFLSLRDVFGKDLIYTLYYWKSSSSGKKTAK

TNTNEFLIDVDKGENYCFSVQAVIPSRTVNRKSTDSPVECM

3. The sequence alignment created:

PZ    AGSYLLEELFEGNLEKECYEEICVYEEAREVFENEVVTDEFWRRYKGGSPCISQPCLHNG 60
1DAN  -ANAFLEELRPGSLERECKEEQCSFEEAREIFKDAERTKLFWISYSDGDQCASSPCQNGG 59
       .. :****  *.**:** ** * :*****:*::   *. **  *..*. * *.** :.*

PZ    SCQDSIWGYTCTCSPGYEGSNCELAKNECH--PERTDGCQHFCLPGQ-ESYTCSCAQGYR 117
1DAN  SCKDQLQSYICFCLPAFEGRNCETHKDDQLICVNENGGCEQYCSDHTGTKRSCRCHEGYS 119
      **:*.: .* * * *.:** ***  *::     :...**:::*      . :* * :**

PZ    LGEDHKQCVPHDQCACGVLTSEKRA--------------PDLQDLPWQVKLTNSEGKDFC 163
1DAN  LLADGVSCTPTVEYPCGKIPILEKRNASKPQGRIVGGKVCPKGECPWQVLLL-VNGAQLC 178
      *  *  .*.*  :  ** :   ::                   : **** *   :* ::*

PZ    GGVIIRENFVLTTAKCSLLHRNITVK------TYFNRTSQDPLMIKITHVHVHMRYDADA 217
1DAN  GGTLINTIWVVSAAHCFDKIKNWRNLIAVLGEHDLSEHDGDEQSRRVAQVIIPSTYVPGT 238
      **.:*.  :*:::*:*    :*            :.. . *    ::::* :   *  .:

PZ    GENDLSLLELEWPIQCPGAGLPVCTPEKDFAEHLLIPRTRGLLSGWARNGTDLGNSLTTR 277
1DAN  TNHDIALLRLHQPVVLTDHVVPLCLPERTFSERTLAFVRFSLVSGWGQLLDRGATALELM 298
       ::*::**.*. *:   .  :*:* **: *:*: *     .*:***.:     ..:*

PZ    --PVTLVEGEECGQVLN-----VTVTTRTYCERSSVA---AMHWMDGSVVTREHRGSWFL 327
1DAN  VLNVPRLMTQDCLQQSRKVGDSPNITEYMFCAGYSDGSKDSCKGDSGGPHATHYRGTWYL 358
         *  :  ::* *  .      .:*   :*   * .   : :  .*.  : .:**:*:*

PZ    TGVLGSQ-PVGGQAHMVLVTKVSRYSLWFKQIMN-------------- 360
1DAN  TGIVSWGQGCATVGHFGVYTRVSQYIEWLQKLMRSEPRPGVLLRAPFP 406
      **::.     .  .*: : *:**:*  *::::*.
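To put a number on how well the two sequences match, percent identity can be computed directly from the aligned strings. The following is a minimal sketch; the function name `alignment_identity` is illustrative, identity is counted over columns where neither sequence has a gap (one of several common conventions), and the usage example covers only the first 60 alignment columns copied from above.

```python
def alignment_identity(aligned_a, aligned_b):
    """Return (identical, aligned_columns, percent_identity) for two aligned strings.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    aligned_columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        aligned_columns += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned_columns if aligned_columns else 0.0
    return identical, aligned_columns, percent


# First 60 columns of the PZ / 1DAN alignment (Gla domain region).
pz_block = "AGSYLLEELFEGNLEKECYEEICVYEEAREVFENEVVTDEFWRRYKGGSPCISQPCLHNG"
dan_block = "-ANAFLEELRPGSLERECKEEQCSFEEAREIFKDAERTKLFWISYSDGDQCASSPCQNGG"

identical, columns, percent = alignment_identity(pz_block, dan_block)
print(f"{identical} identical of {columns} aligned columns ({percent:.1f}%)")
```

Running this per domain (Gla, EGF, SP-like) rather than over the whole alignment shows which regions are well covered by the template and which may need a different modeling method.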

Annemarie Honegger

I would first model each domain separately, using the method best suited to the level of sequence homology: homology modeling for good similarity, threading for real but low similarity, and ab initio modeling (e.g. with Rosetta) for any domain for which you could not find a template. Then arrange the relative orientation of the individual domain models as best you can, either by docking or by leaving space for an unfolded linker segment. You can then combine these domains, in a reasonable relative orientation, into one PDB file and use this file as a template for homology modeling of the full-length protein.
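The last step (combining the domain models into one PDB file) can be sketched as below. This is a minimal sketch under stated assumptions: the domain models have already been placed in a common coordinate frame, only `ATOM` records are kept, insertion codes are dropped, and the function name `merge_domain_models` and the coordinates in the usage example are placeholders. It does not orient the domains; it only renumbers and concatenates them.

```python
def merge_domain_models(pdb_texts, chain_id="A"):
    """Concatenate ATOM records from several PDB texts into one chain.

    Atoms and residues are renumbered consecutively; coordinates are
    left untouched, so the inputs must already share one frame.
    """
    merged = []
    serial = 0
    new_resnum = 0
    for text in pdb_texts:
        previous_key = None
        for line in text.splitlines():
            if not line.startswith("ATOM"):
                continue
            line = line.ljust(80)
            # chain ID (column 22) + residue number and insertion code (columns 23-27)
            key = (line[21], line[22:27])
            if key != previous_key:
                new_resnum += 1
                previous_key = key
            serial += 1
            merged.append(
                f"{line[:6]}{serial:5d}{line[11:21]}{chain_id}"
                f"{new_resnum:4d} {line[27:]}".rstrip()
            )
    merged.append("TER")
    merged.append("END")
    return "\n".join(merged) + "\n"


# Placeholder coordinates, for illustration only.
domain_1 = (
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00\n"
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00\n"
)
domain_2 = (
    "ATOM      1  N   GLY A   1      20.000   1.000   2.000  1.00  0.00\n"
)

merged = merge_domain_models([domain_1, domain_2])
print(merged)
```

Linker residues missing between the domains stay missing in the merged file; in the subsequent full-length homology modeling step they appear as gaps in the template row of the alignment and are built by the modeling program.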

Pankaj Kumar Singh

Hi Soumyadev,

The problem you describe is a very common one. If you choose to use MODELLER, I would suggest a multiple-template modelling approach; to make your work easier, try installing EasyModeller.

Alternatively, you can use I-TASSER, which is one of the best protein modelling servers available.

Hope this helps,

Cheers,

Pankaj
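For the MODELLER route, multiple-template modelling starts from a PIR alignment file in which the target and every template appear as separate entries of equal aligned length, with `-` for gaps and `/` marking a chain break. A minimal sketch of writing such a file follows. The helper `pir_entry`, the template codes `tmplA` and `tmplB`, the residue ranges, and the toy sequences are all hypothetical placeholders; the real ranges and chain IDs must be taken from the actual template PDB files.

```python
def pir_entry(code, kind, sequence, start="", start_chain="",
              end="", end_chain="", width=60):
    """Format one PIR entry: '>P1;code', a 10-field header line, then the sequence.

    kind is 'sequence' for the target or 'structureX' for an X-ray template.
    """
    fields = [kind, code, str(start), start_chain, str(end), end_chain,
              "", "", "", ""]
    body = sequence + "*"  # PIR sequences are terminated by '*'
    wrapped = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([f">P1;{code}", ":".join(fields)] + wrapped) + "\n"


# Toy aligned sequences (all the same length); not real PZ or template data.
target = "ACDEFGHIKL"
template_a = "ACDEF-----"  # hypothetical template covering the N-terminal part
template_b = "-----GHIKL"  # hypothetical template covering the C-terminal part

assert len(target) == len(template_a) == len(template_b)

alignment_text = "\n".join([
    pir_entry("tmplA", "structureX", template_a, 1, "A", 5, "A"),
    pir_entry("tmplB", "structureX", template_b, 1, "A", 5, "A"),
    pir_entry("PZ", "sequence", target),
])
print(alignment_text)
```

The resulting text would be saved as the alignment file that the MODELLER script (or EasyModeller) reads together with the template PDB files.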

Soumyadev Sarkar

Annemarie Honegger Thank you for the idea. I will try this. Let's see how this works out.

Soumyadev Sarkar

Pankaj Kumar Singh EasyModeller helped! Thank you.

Sudip Kumar Dutta

Hi Soumya,

I usually predict the secondary structure of the query protein from its amino acid sequence (you can use PSIPRED or other web servers), then try to find templates based on secondary structure. Use the template(s) for modeling with MODELLER, where you can specify the chain that you wish to use, followed by energy minimization of the generated model.
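If a tool accepts only a single template file, the chains of interest can be pulled out of a multi-chain PDB entry beforehand. A minimal sketch, assuming standard fixed-column PDB format with the chain ID in column 22; the function name `extract_chains`, the residue numbers, and the coordinates in the usage example are placeholders.

```python
def extract_chains(pdb_text, chain_ids):
    """Keep only ATOM/HETATM records whose chain ID is in chain_ids."""
    wanted = set(chain_ids)
    kept = [
        line for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
        and len(line) > 21
        and line[21] in wanted  # chain ID is column 22 (index 21)
    ]
    return "\n".join(kept + ["END"]) + "\n"


# Placeholder records with three chains, for illustration only.
example_pdb = (
    "ATOM      1  N   ILE H  16      10.000  10.000  10.000  1.00  0.00\n"
    "ATOM      2  N   ALA L   1      20.000  20.000  20.000  1.00  0.00\n"
    "ATOM      3  N   ASN T   5      30.000  30.000  30.000  1.00  0.00\n"
)

light_and_heavy = extract_chains(example_pdb, ["L", "H"])
print(light_and_heavy)
```

Records are kept in file order, so the chain order in the output follows the input file rather than the order of `chain_ids`.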

Soumyadev Sarkar

Sudip Kumar Dutta Thanks Sudip, I would like to tell you that I have managed to generate a good structure of the protein. Thanks for your input. Much appreciated.
