How would you explain umbrella sampling to a newcomer, or to someone without a quantitative background? Is there a good analogy for understanding it? I am using GROMACS for my simulations. Opinions and insights would be appreciated. Thanks.
I'll attempt to provide an even simpler view. The purpose of any molecular dynamics (MD) simulation is to sample (ideally all of) the states in which a molecule of interest may exist. From this sampling, the probability (and hence the free energy) of the molecule being in any of these states can be calculated. Very often (too often), certain states of a protein are separated from others by extremely high energy barriers. It could take years, if not centuries, of conventional MD simulation to walk through all molecular states. Umbrella sampling (among other techniques) accelerates the sampling by "flattening" the hills and ridges that prevent MD from reaching certain states. In umbrella sampling, the energy landscape is "flattened" by adding artificial "umbrella" potentials that are meant to "mirror", and thus cancel, the real barriers. However, it would be difficult, if not impossible, to make an umbrella potential account for all degrees of freedom in the system (a typical protein has a few thousand of them, even if the solvent is neglected). Hence, the umbrella potential acts on only a few degrees of freedom (most often one to three), usually called collective variables or reaction coordinates. Sampling is considered complete when the system has "visited" all values of the collective variables more than once, that is, often enough for an accurate and unbiased calculation of the state probabilities.
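To make the "flattening" picture concrete, here is a minimal sketch on a toy one-dimensional double well. The potential, the 8 kT barrier height, and the grid are illustrative assumptions, not taken from any real system. Note that the "ideal" bias used here is the negative of the landscape itself, which in a real calculation is exactly the unknown quantity one is trying to find.

```python
import numpy as np

# Toy 1D landscape in units of kT (an assumption for illustration):
# two wells at x = -1 and x = +1, separated by an 8 kT barrier at x = 0.
x = np.linspace(-1.5, 1.5, 301)
U = 8.0 * (x**2 - 1.0) ** 2

# Unbiased Boltzmann probability of each grid point.
p = np.exp(-U)
p /= p.sum()

# How rarely the barrier top is visited compared with the well bottoms
# (of order exp(-8) for this toy potential).
barrier_index = np.argmin(np.abs(x))
ratio = p[barrier_index] / p.max()

# An ideal "umbrella" that mirrors the landscape cancels it completely,
# so every value of x becomes equally likely.
bias_ideal = -U
p_flat = np.exp(-(U + bias_ideal))
p_flat /= p_flat.sum()

print(f"barrier/well probability ratio without bias: {ratio:.2e}")
print(f"spread of probabilities with ideal bias: {p_flat.max() - p_flat.min():.2e}")
```

Because the ideal bias is not known in advance, practical umbrella sampling replaces it with a set of simple restraints placed along the collective variable, and removes their effect afterwards.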
Umbrella sampling is a very popular technique for calculating the potential of mean force (PMF) in studies of protein binding and unbinding. The binding free energy can then be extracted from the resulting PMF. The method essentially applies a harmonic restraint, in the form of a bias potential, relative to a reference molecule. The coordinate along which the restraint acts is the chosen collective variable (for example, a distance, a radius of gyration, or an angle). The choice of collective variable is the key to an accurate binding free energy, so the user should have some prior idea of the chemical process. If not, many trial calculations may be needed to find a suitable collective variable.
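For readers who prefer to see the harmonic restraint written out, the usual textbook form is sketched below. The symbols are my own notation, not from the post above: \xi is the collective variable, \xi_i and k_i are the centre and force constant of window i, P_i^{b} is the biased distribution sampled in that window, and C_i is a window-dependent constant.

```latex
% Harmonic bias applied in window i
w_i(\xi) = \frac{k_i}{2}\,(\xi - \xi_i)^2

% Potential actually simulated in window i
U_i^{b}(\mathbf{r}) = U(\mathbf{r}) + w_i\big(\xi(\mathbf{r})\big)

% PMF recovered from the biased distribution of window i
F(\xi) = -k_B T \ln P_i^{b}(\xi) - w_i(\xi) + C_i
```

The constants C_i differ from window to window, which is why the per-window profiles have to be combined with a method such as WHAM rather than simply pasted together.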
For detailed information on how to do such a calculation, you can refer to the following link:
I agree with both Abdul and Biswajit. However, I would like to add some more points.
Umbrella sampling (US), in simple words, is non-Boltzmann sampling: we add an extra term to the energy so that the sampling explores the free energy in collective variable (CV) space. In practice, CV space is divided into a series of windows, and in each window an external potential, usually harmonic, keeps the sampled distribution peaked within one region of the order parameter. Once the biased simulations have been run for every window, the resulting distributions are reweighted and stitched together to obtain the unbiased probability distribution and the associated free energy profile, which is the ultimate aim of umbrella sampling. The weighted histogram analysis method (WHAM) is used to weight the individual distributions and stitch them. A discussion of WHAM is available at http://membrane.urmc.rochester.edu/content/wham.
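The window-and-stitch procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a toy double-well landscape in units of kT, synthetic samples drawn directly from each biased distribution instead of real MD trajectories, and a plain fixed-point iteration of the WHAM equations. The function name `wham` and all parameter values are hypothetical; for real GROMACS data one would normally use the `gmx wham` tool instead.

```python
import numpy as np

def wham(counts, bias, n_samples, n_iter=10000, tol=1e-10):
    """Combine biased histograms into one unbiased distribution.

    counts:    (n_windows, n_bins) histogram of each window
    bias:      (n_windows, n_bins) bias energy of each window at each bin, in kT
    n_samples: (n_windows,) number of samples per window
    """
    f = np.zeros(len(counts))          # free-energy offset of each window, in kT
    total = counts.sum(axis=0)         # pooled histogram over all windows
    for _ in range(n_iter):
        # Weighted denominator of the WHAM estimate for each bin.
        denom = (n_samples[:, None] * np.exp(f[:, None] - bias)).sum(axis=0)
        p = total / denom
        p /= p.sum()
        # Update the window offsets from the current estimate of p.
        f_new = -np.log((p[None, :] * np.exp(-bias)).sum(axis=1))
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return p

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 81)                  # bins along the CV
U = 5.0 * (x**2 - 1.0) ** 2                     # toy landscape, 5 kT barrier
centers = np.linspace(-1.5, 1.5, 16)            # window centres
k = 50.0                                        # force constant, kT per unit^2
bias = 0.5 * k * (x[None, :] - centers[:, None]) ** 2
n_samples = np.full(len(centers), 5000)

# Stand-in for the biased MD runs: sample each window's biased distribution.
counts = np.zeros_like(bias)
for i in range(len(centers)):
    p_b = np.exp(-(U + bias[i]))
    p_b /= p_b.sum()
    idx = rng.choice(len(x), size=n_samples[i], p=p_b)
    counts[i] = np.bincount(idx, minlength=len(x))

p = wham(counts, bias, n_samples)
mask = (np.abs(x) <= 1.2) & (p > 0)             # region covered by the windows
F = -np.log(p[mask])
pmf = F - F.min()                               # free energy profile in kT
```

In this toy setting the recovered `pmf` can be compared directly with `U[mask]`, which is not possible for a real system where the landscape is unknown. The sketch also shows why neighbouring windows must overlap: without shared bins the offsets `f` cannot be determined reliably.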
US can be used to study events that are hindered by a relatively large free energy barrier. You can also use the PLUMED software to perform umbrella sampling in GROMACS (https://plumed.github.io/doc-v2.4/user-doc/html/belfast-4.html). As pointed out by Abdul, US is not easy to explain in a few words. I hope this helps.
As the answers above explain, umbrella sampling can be understood in terms of the potential energy surface (PES) or of sampling in CV space. Here is another way to look at it:
See this GROMACS tutorial. Umbrella sampling leaves you a good deal of freedom in constructing the reaction path you have in mind. The different configurations (or states) along the reaction path can be generated by "linear interpolation" of the coordinates, but a more reliable way is to set them up manually, which is why I speak of freedom, or arbitrariness. The configurations, each located at a different λ value (reaction coordinate, or collective variable), are then simulated, and during the simulation the system samples around each of them. Each configuration is held in place by adding a harmonic potential to the energy calculation (in real space), so the probability distribution with which the system samples configurations is shaped like an umbrella (in CV space; if the energy is considered as well, the PES view is recovered).
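When setting the window positions manually, a common rule of thumb is to space them so that neighbouring "umbrellas" overlap. The sketch below estimates the width of one umbrella from the force constant and proposes evenly spaced centres. It rests on assumptions that should be checked for a real system: the underlying free energy is treated as flat within a window (so the sampled distribution is Gaussian with variance k_B T / k), the units follow the GROMACS convention (kJ/mol, nm), and the function names, the 0.5 to 2.5 nm range, and the two-sigma spacing are hypothetical choices.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def window_sigma(k, temperature=300.0):
    """Standard deviation of the CV under the restraint 0.5*k*(xi - xi0)^2,
    assuming the underlying free energy is flat inside the window."""
    return np.sqrt(KB * temperature / k)

def window_centers(xi_min, xi_max, k, temperature=300.0, spacing_in_sigma=2.0):
    """Evenly spaced window centres, no further apart than
    spacing_in_sigma standard deviations of a single window."""
    sigma = window_sigma(k, temperature)
    n = int(np.ceil((xi_max - xi_min) / (spacing_in_sigma * sigma))) + 1
    return np.linspace(xi_min, xi_max, n)

# Example: distance CV from 0.5 to 2.5 nm, k = 1000 kJ/(mol nm^2).
sigma = window_sigma(1000.0)
centers = window_centers(0.5, 2.5, k=1000.0)
print(f"window width (sigma): {sigma:.4f} nm")
print(f"{len(centers)} windows, spacing {centers[1] - centers[0]:.4f} nm")
```

Where the real free energy is steep, the sampled distribution shifts and narrows, so the histograms should always be inspected for overlap after the runs and extra windows added where gaps appear.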
More importantly, umbrella sampling is used for free energy calculations: the energies along the reaction path are obtained from the umbrella sampling simulations.