I am not aware of a standard or straightforward way of doing so.
My suggestion is to do something like this:
- Add the missing residues to your system using standard modeling software (e.g. Sybyl), connecting only one side of the loop sequence to the protein first.
- Use distance constraints within the modeling software to bring the still-open side of the loop and the protein close together (minimization or a short molecular dynamics simulation). Alternatively, PyMOL offers the possibility of 'sculpting', i.e. dragging your protein chain manually like a rubber band.
During this closure process, the rest of the protein should be fixed.
- Connect the loop terminus with the protein (i.e. change topology files or residue names). Now you have a first model with a loop.
- Perform a standard higher-temperature MD or simulated annealing simulation for the loop, still with the positions of the rest of the protein fixed. Do this in order to relax the loop.
- Check that the final loop structure has not formed any secondary structure elements and does not interact too strongly with the rest of the protein, but instead extends into the solvent and shows distinct flexibility.
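To illustrate the closure step in the list above, here is a minimal sketch of a harmonic distance restraint that pulls the free loop end toward a fixed anchor atom on the protein. It is only a toy: a single loop atom is moved by steepest descent, whereas real modeling software would move the whole loop under a full force field. The function names, the force constant `k`, and the step size are assumptions chosen for illustration; the target of about 1.33 Å is the approximate length of a peptide C-N bond.

```python
import math

PEPTIDE_BOND = 1.33  # approximate C-N peptide bond length in angstroms


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def restraint_energy(loop_end, anchor, k=10.0, target=PEPTIDE_BOND):
    """Harmonic restraint energy 0.5 * k * (d - target)^2 (arbitrary units)."""
    d = distance(loop_end, anchor)
    return 0.5 * k * (d - target) ** 2


def close_gap(loop_end, anchor, k=10.0, target=PEPTIDE_BOND,
              step=0.01, n_steps=2000):
    """Steepest descent on the restraint; only the loop atom moves.

    The anchor (the protein side) stays fixed, mirroring the advice to
    keep the rest of the protein frozen during closure.
    """
    pos = list(loop_end)
    for _ in range(n_steps):
        d = distance(pos, anchor)
        if d == 0.0:
            break  # direction undefined; nothing sensible to do
        # Gradient of the energy with respect to pos:
        # k * (d - target) * (pos - anchor) / d
        scale = k * (d - target) / d
        pos = [p - step * scale * (p - a) for p, a in zip(pos, anchor)]
    return pos


if __name__ == "__main__":
    anchor = (0.0, 0.0, 0.0)     # hypothetical N atom of the next residue
    loop_end = (12.0, 0.0, 0.0)  # hypothetical C atom of the open loop end
    closed = close_gap(loop_end, anchor)
    print(round(distance(closed, anchor), 3))
```

Once the gap is down to bonding distance, the topology can be edited to create the actual bond, as described in the step that follows the closure.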
Of course, if the loop contains the binding site of your protein, things get complicated. The question, however, is whether you are interested in the modelled loop region itself, e.g. to perform docking studies, or whether you just need a full-length protein for MD simulation.
If the loop region is missing in your structure (an experimental one, I suppose), this normally means that the loop is flexible or unstructured.
You certainly know that a folded structure is no prerequisite for protein-protein binding; indeed, many protein-protein interactions are mediated by short linear interaction motifs (SLiMs). Unless you have experimental evidence that your structure is folded, you should consider a linear interaction of this kind.
On the other hand, if you know that your structure adopts a specific fold upon binding, you can model this substructure first and then insert it into your protein with a method similar to the one I suggested above.
You can try building the loop with Schrödinger and then use accelerated MD (aMD) instead of classical MD. To speed up your calculations you can use the GPU version implemented in Amber 12. Run at least 5 independent simulations (70-100 ns each) and you will get an idea of the real folding of your loop. I was able to reproduce the correct folding and position of a helix starting from an RMSD of about 30-40 Å. This was actually impossible with "classical" methods.
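For readers unfamiliar with aMD, the idea can be sketched in a few lines. In the commonly used form of the method, a boost energy is added whenever the potential energy V falls below a threshold E, so that energy wells are raised while regions above the threshold are left untouched. The snippet below is a minimal illustration of that boost function only; the function names and the numbers in the example are assumptions, not Amber settings.

```python
def amd_boost(v, e_threshold, alpha):
    """Boost energy dV = (E - V)^2 / (alpha + E - V) for V < E, else 0.

    v           : instantaneous potential energy
    e_threshold : threshold energy E
    alpha       : tuning parameter (> 0); smaller alpha flattens wells more
    """
    if v >= e_threshold:
        return 0.0
    gap = e_threshold - v
    return gap * gap / (alpha + gap)


def modified_potential(v, e_threshold, alpha):
    """Potential actually sampled in aMD: V* = V + dV."""
    return v + amd_boost(v, e_threshold, alpha)


if __name__ == "__main__":
    # Illustrative values in arbitrary energy units
    for v in (0.0, 5.0, 10.0, 15.0):
        print(v, modified_potential(v, e_threshold=10.0, alpha=10.0))
```

Because only the low-energy regions are lifted, barriers become smaller relative to the wells without heating the whole system, which is the usual argument for aMD over simply raising the temperature.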
Based on your image, it seems the loop is facing the solvent. Most probably it is flexible and unstructured. You may not want to model it if that is the case. It may become ordered upon interaction with its partner.
Check my paper: http://pubs.acs.org/doi/abs/10.1021/ct300083m.
I modeled an unmapped linker in P-gp (56 aa) based on secondary structure predictions from online servers and built it in MOE accordingly. I then performed a short MD run in MOE in implicit solvent to equilibrate it (100 ps).
Afterwards, follow these steps:
1) Create 8-10 independent MD simulations (I used GROMACS), change the initial parameters (temperature, pressure, algorithm, cutoff) and run under NVT conditions for 100 ps;
2) Extend these runs under identical conditions (NPT ensemble) for 20 ns;
3) Evaluate the total energy, Ramachandran plot, etc.
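For the Ramachandran part of step 3, the quantity you need per residue is the pair of backbone dihedrals: phi is defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The sketch below computes a signed dihedral from four sets of coordinates in plain Python. It is only an illustration; in practice the analysis tools shipped with your MD package will do this for you, and the function names here are made up for the example.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond
    v = _sub(b0, [_dot(b0, b1) * x for x in b1])
    w = _sub(b2, [_dot(b2, b1) * x for x in b1])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(prev_c, n, ca, c, next_n):
    """Backbone (phi, psi) of one residue from five atom positions."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)


if __name__ == "__main__":
    # Toy coordinates, not taken from a real structure
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Collecting (phi, psi) for every loop residue over the last part of each run lets you compare the independent simulations on one plot and spot residues that sit in disallowed regions.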
Thank you. A somewhat unexpected approach. I think one usually tries to lower those above. Does it really work well? At first glance I do not see why it is better than just raising the temperature.
I took a protein, PDB entry 1OPD, and froze all atoms except a loop of 18 residues on the surface.
I unfolded this loop using MD with rigid restraints and then tried to fold it back with replica exchange MD. 3 of the 8 replicas reached an RMSD of 2-2.5 Å during a 3 ns simulation. It took just over a week on a processor with 4 cores. Thus the problem looks solvable by MD.
Conditions:
- AMBER99SB
- 8 replicas in the temperature range of 273.15 - 370.00 K
- GB model as implemented in Abalone (http://www.biomolecular-modeling.com/Abalone/index.html)
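As a rough sketch of what a replica exchange setup like this involves, the snippet below builds a temperature ladder for 8 replicas between 273.15 K and 370 K and evaluates the standard Metropolis criterion for swapping two neighbouring replicas. The geometric spacing of the temperatures is an assumption on my part (the spacing actually used is not stated above), and the function names and example energies are purely illustrative; this is not how Abalone exposes the method.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def temperature_ladder(t_min, t_max, n):
    """Geometrically spaced temperatures (assumed spacing, see text)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]


def exchange_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance for swapping replicas i and j.

    e_i, e_j : potential energies (kcal/mol) of the configurations
               currently held at temperatures t_i and t_j
    P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)])
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, t_i, e_j, t_j, rng=random):
    """Return True if the swap is accepted."""
    return rng.random() < exchange_probability(e_i, t_i, e_j, t_j)


if __name__ == "__main__":
    ladder = temperature_ladder(273.15, 370.0, 8)
    print([round(t, 2) for t in ladder])
    # Hypothetical energies for two neighbouring replicas
    print(exchange_probability(-110.0, ladder[0], -100.0, ladder[1]))
```

With an implicit solvent model such as GB the system has few degrees of freedom, so a small number of replicas can still give reasonable exchange rates over a range of about 100 K.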