I might be wrong, but personally I do not think you can. You can generate a new .tpr file that starts from the last frame of the trajectory and continue from there, although that is not going to be an exact continuation. For a proper continuation, that is, with the -cpi flag of mdrun, you need a .cpt file. I do not think you can generate or recover it from the other files since, from my understanding, the .cpt is a binary file that keeps the last velocities, forces, barostat/thermostat states, etc.
I hope I am wrong and someone comes up with a solution, as sometimes I also need to continue from a corrupted .cpt.
Nicola Piasentin Thanks a lot for your suggestion. I totally agree with your remarks. Could you please explain how I can generate a new .tpr file using the last frame of the trajectory? Should I dump it, or is there another way to work around this? I have a very large system.
Nicola Piasentin Thanks a lot for your time and input. The MD run that stopped abruptly was itself an extension of a previous MD run. So I believe I need to change init-step ... in the .mdp file when generating the new .tpr file, before running the production MD? Kindly provide your feedback.
You can try this out. However, I don't know if this will work.
1. You must ensure that you have the necessary input files of your previous simulation, which include the topology file (.top), the last coordinate file (.gro), and the trajectory file (.xtc).
2. Using the gmx grompp command, prepare a new .tpr file. Specify the last frame of your trajectory as the starting point.
From your question, what I understand is that you were running a 1 ns MD production starting from 0 and, for some reason, the run stopped. In this situation, the .xtc, .edr, .log and .cpt files are generated, but the final .gro file is not.
gmx mdrun -ntmpi 12 -ntomp 6 -pin on -v -nb gpu -pme cpu -tunepme yes -pmefft cpu -bonded cpu -update cpu -deffnm md_0_1
Generally, to resume an MD production run (proper continuation), we must provide the .cpt file to the gmx mdrun command.
The syntax for this is:
gmx mdrun -ntmpi 12 -ntomp 6 -pin on -v -nb gpu -pme cpu -tunepme yes -pmefft cpu -bonded cpu -update cpu -deffnm md_0_1 -cpi md_0_1.cpt -s md_0_1.tpr -append
The .cpt file contains information such as the last frame's atomic coordinates, velocities, box dimensions, simulation parameters, energies, and other relevant data. Therefore, as Nicola Piasentin says, there is no other way to get an exact or proper continuation of the MD production without providing the .cpt file to mdrun.
However, I tried an alternative approach and it worked.
Here are the steps:
1. Find out the time (ps) and step of the last frame generated by looking in the log file.
Ex.
Step Time
95000 190.00000
2. Extract the last frame from xtc file and save it in trr and gro format.
gmx trjconv -f md_0_1.xtc -s md_0_1.tpr -o last_frame.gro -dump 190 -tu ps #Select group for output: 0 (System)
gmx trjconv -f md_0_1.xtc -s md_0_1.tpr -o last_frame.trr -dump 190 -tu ps #Select group for output: 0 (System)
3. Change the value of nsteps in the md.mdp file: subtract the completed steps from the total number of steps. Save the .mdp file under a new name (md_revised.mdp).
5. Run the MD production. Note that this run starts from 0 and covers only the remaining steps.
gmx mdrun -ntmpi 12 -ntomp 6 -pin on -v -nb gpu -pme cpu -tunepme yes -pmefft cpu -bonded cpu -update cpu -deffnm md_190_1
6. Concatenate the two XTC files into a single trajectory file using the “-settime” option. The "-settime" option in the gmx trjcat command allows you to interactively specify the start time for each input trajectory file. When concatenating multiple trajectory files, this option prompts you to provide the start time for each file individually. Doing so ensures that the concatenated trajectory maintains a continuous time progression.
Here’s how it works:
When you use "-settime", you'll be prompted to enter the start time for the first file, then the second file, and so on (I used the c option in both cases).
c (continue) - The start time is taken from the end of the previous file. Use it when your continuation run restarts with t=0.
l (last) - The time in this file will be changed the same amount as in the previous. Use it when the time in the new run continues from the end of the previous one, since this takes possible overlap into account.
The program uses these specified start times to create a seamless concatenated trajectory.
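Steps 1 and 3 above come down to a small piece of arithmetic, and step 4 is not shown in the post. The sketch below is only an illustration under stated assumptions: a 1 ns run at dt = 0.002 ps (500000 steps in total, which is consistent with step 95000 at 190 ps), a tiny stand-in log file so that it runs on its own, and a guess that the missing step 4 is a gmx grompp call that takes the dumped frame through -t. The file name topol.top is hypothetical, and the grompp line is printed rather than executed.

```shell
# Stand-in for md_0_1.log so the sketch is self-contained (point log at the real file).
workdir=$(mktemp -d)
log="$workdir/md_0_1.log"
printf '%s\n' \
    '           Step           Time' \
    '          90000      180.00000' \
    '' \
    '           Step           Time' \
    '          95000      190.00000' > "$log"

# Step 1: take the line that follows the last "Step Time" header in the log.
last=$(awk '$1 == "Step" && $2 == "Time" { getline; step = $1; time = $2 }
            END { print step, time }' "$log")
last_step=${last% *}
last_time=${last#* }

# Step 3: remaining steps = total steps - completed steps.
total_steps=500000            # assumption: 1 ns at dt = 0.002 ps
remaining=$((total_steps - last_step))
echo "last step $last_step at $last_time ps; set nsteps = $remaining in md_revised.mdp"

# Presumed step 4 (not in the original post): build the new .tpr from the dumped frame.
# topol.top is a hypothetical topology name; the command is only printed here.
grompp_cmd="gmx grompp -f md_revised.mdp -c last_frame.gro -t last_frame.trr -p topol.top -o md_190_1.tpr"
echo "$grompp_cmd"
rm -rf "$workdir"
```

With the stand-in log this reports 405000 remaining steps. The last entry in a real log is not guaranteed to coincide with the last frame in the .xtc, so it is worth cross-checking the time against the trajectory before dumping.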
Mahendra Gaur yes, in this case you will have a trajectory of the length you want. However, it still won't be an exact continuation, as you are not linking the .cpt file and you lose information about barostat/thermostat/velocities/forces etc.
Moreover, pay attention to the step you read from the log file. The log file and the trajectory file may not be written at the same frequency, e.g. the log file can be written every 1 ps and the trajectory every 10 ps. You can check the output frequency in your .mdp file. As such, if you read the last time in the .log file there is a (good) probability that it won't coincide with the time of the last frame of the trajectory. I think it would be better to check the time of the last frame directly from the trajectory file; you should be able to do so with tools like gmx check. If you write the last frame out as a .gro file, you may also be able to read the time from the first descriptive line of the .gro file itself.
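As a rough illustration of that last point: a frame written by gmx trjconv usually carries its time in the title line, in a form like "t= 190.00000 step= 95000". That format is an assumption here, and the .gro below is a two-atom stand-in rather than real output, so treat this as a sketch only.

```shell
# Two-atom stand-in for last_frame.gro (replace with the file written by trjconv).
workdir=$(mktemp -d)
gro="$workdir/last_frame.gro"
printf '%s\n' \
    'Protein in water t= 190.00000 step= 95000' \
    '    2' \
    '    1SOL     OW    1   0.126   1.624   1.679' \
    '    1SOL    HW1    2   0.190   1.661   1.747' \
    '   3.00000   3.00000   3.00000' > "$gro"

# Pull the number that follows "t=" on the first line only.
frame_time=$(sed -n '1s/.*t= *\([0-9.][0-9.]*\).*/\1/p' "$gro")
echo "last frame written at $frame_time ps"
rm -rf "$workdir"
```

If the title line has no "t=" field, frame_time simply comes back empty, in which case gmx check on the trajectory is the safer source.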
Nicola Piasentin, Thanks a lot for your quick response and guidance.
Yes, you are right. Since we are not providing a .cpt file to gmx mdrun, we have lost information about barostat/thermostat/velocities/forces, etc.
However, the -t option in GROMACS’ gmx grompp command is used to specify a trajectory file (such as .trr or .cpt). It allows us to restart simulations by reading the last frame with coordinates and barostat/thermostat/velocities/forces from the specified trajectory file. When we provide a trajectory file with -t, gmx grompp will use the coordinates and velocities from the last frame of this file unless the -time option is used to specify a different time frame. This is particularly useful for preserving simulation continuity, ensuring that the simulation picks up exactly where it left off.
Yes, we should check the time of the last frame from the .xtc file.
gmx check -f md_0_1.xtc
Last frame 19 time 190.000
In my case, the time of the last frame is identical in the .log and .xtc files because, in the .mdp file used for mdrun, I set nstlog and nstxout-compressed so that both are written every 10.0 ps.
For this approach, we must ensure that the log file and the trajectory file are written at the same frequency to avoid discrepancies.
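A quick way to confirm that the two output intervals match is to read them straight from the .mdp file. The snippet below is a minimal sketch that writes a stand-in md.mdp (dt = 0.002 ps and 5000 steps per write, i.e. 10 ps, are assumptions chosen to match the numbers above) and compares the two values; mdp_value is a hypothetical helper, not a GROMACS tool.

```shell
# Stand-in md.mdp so the sketch runs on its own (point mdp at your real file).
workdir=$(mktemp -d)
mdp="$workdir/md.mdp"
printf '%s\n' \
    'dt                 = 0.002   ; ps' \
    'nstlog             = 5000    ; log every 10 ps' \
    'nstxout_compressed = 5000    ; xtc every 10 ps' > "$mdp"

# Print the value of one .mdp key; '-' and '_' are treated alike, comments are dropped.
mdp_value() {
    awk -F'=' -v key="$1" '
        { sub(/;.*/, "") }
        NF >= 2 {
            k = $1; v = $2
            gsub(/[ \t]/, "", k); gsub(/_/, "-", k)
            gsub(/[ \t]/, "", v)
            if (k == key) print v
        }' "$2"
}

nstlog=$(mdp_value nstlog "$mdp")
nstxtc=$(mdp_value nstxout-compressed "$mdp")
if [ "$nstlog" -eq "$nstxtc" ]; then
    verdict="same interval"
else
    verdict="different intervals"
fi
echo "nstlog=$nstlog nstxout-compressed=$nstxtc: $verdict"
rm -rf "$workdir"
```

When the intervals differ, the time reported by gmx check on the .xtc is the one to pass to -dump, not the last time in the log.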
To further confirm that no barostat/thermostat/velocity/force information is lost when using the -t option of gmx grompp, I compared the RMSD of the full trajectory and of the combined trajectory. Please find the image file attached. In both cases, the RMSD trend is exactly the same.
So now, if we don't have the .cpt file and want to resume terminated md production, we still have an alternative approach to resume md production.
How do the two trajectories differ? Are they the same just split at a certain point or did you restart and these are two separate runs?
I would still be careful with this approach. Restarting the velocities etc. might not be a problem, especially if the system is at equilibrium, since we are going to stay more or less in the same ballpark. At the end of the day, we are usually sampling macrostates, and microstate details - as long as the distribution is the same - should be irrelevant when taken alone.
However, again, the GROMACS manual says that a .cpt file is required for a precise continuation. For example, I am pretty sure that the .xtc file contains only the positions, and not even in full precision. So in the step where you extract the last frame of the .xtc and save it as a .trr, that is just a change of file extension: you are recovering neither the full-precision positions nor the velocities, as these are not stored in the .xtc. You can change the shape, but not the content of the file. I guess that the velocities will be generated on the spot, or will just be zero, or something like that. Also, you have to generate a new .tpr and then concatenate the trajectories, which you do not have to do with the .cpt, as the memory of where the simulation stopped is fully conserved.
I might be wrong, as this is really about technical details, and again, if you are around equilibrium it shouldn't be a huge problem to resample the velocities from a Boltzmann distribution at the temperature of the system, but I would be very careful when considering the .trr as the proper .cpt run extension.
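If the velocities are going to be resampled anyway, it may be cleaner to ask for that explicitly in the .mdp file rather than rely on whatever the .trr happens to contain. The fragment below is only a sketch: the option names are standard .mdp options, but the 300 K value and the stand-in md_revised.mdp (with the nsteps value that follows from a 500000-step run stopped at step 95000) are assumptions to be replaced with your own settings.

```shell
# Stand-in md_revised.mdp so the sketch runs on its own.
workdir=$(mktemp -d)
mdp="$workdir/md_revised.mdp"
printf '%s\n' 'nsteps       = 405000  ; remaining steps' > "$mdp"

# Append the restart settings. If your .mdp already defines these keys,
# edit them in place instead, since grompp rejects duplicate entries.
cat >> "$mdp" <<'EOF'
; fresh start from the dumped frame, not an exact continuation
continuation = no      ; apply constraints to the starting configuration
gen-vel      = yes     ; draw new velocities from a Maxwell distribution
gen-temp     = 300     ; K, placeholder: use the temperature of your thermostat
gen-seed     = -1      ; let GROMACS pick a pseudo-random seed
EOF

fragment=$(cat "$mdp")
echo "$fragment"
rm -rf "$workdir"
```

This makes the restart a new, statistically equivalent run rather than a continuation, which is the distinction drawn above.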