Useful mdrun features¶
This section discusses features in gmx mdrun that don’t fit well elsewhere.
Re-running a simulation¶
The rerun feature allows you to take any trajectory file traj.trr
and compute quantities based upon the coordinates in that file using
the model physics supplied in the topol.tpr
file. It can be used
with command lines like mdrun -s topol -rerun traj.trr
. That tpr
could be different from the one that generated the trajectory. This
can be used to compute the energy or forces for exactly the
coordinates supplied as input, or to extract quantities based on
subsets of the molecular system (see gmx convert-tpr and
gmx trjconv). It is easier to do a correct “single-point” energy
evaluation with this feature than a 0-step simulation.
Neighbor searching is normally performed for every frame in the
trajectory, since gmx mdrun can no longer assume anything about how the
structures were generated. If nstlist
is zero, then only one
neighbor list will be constructed. Naturally, no update or constraint
algorithms are ever used.
Running a simulation in reproducible mode¶
It is generally difficult to run an efficient parallel MD simulation
that is based primarily on floating-point arithmetic and is fully
reproducible. By default, gmx mdrun will observe how things are going
and vary how the simulation is conducted in order to optimize
throughput. However, there is a “reproducible mode” available with
mdrun -reprod
that will systematically eliminate all sources of
variation within that run; repeated invocations on the same input and
hardware will be binary identical. However, running in this mode on
different hardware, or with a different compiler, etc. will not be
reproducible. This should normally only be used when investigating
possible problems.
Running multi-simulations¶
There are numerous situations where running a related set of simulations within the same invocation of mdrun are necessary or useful. Running a replica-exchange simulation requires it, as do simulations using ensemble-based distance or orientation restraints. Running a related series of lambda points for a free-energy computation is also convenient to do this way.
This feature requires
configuring GROMACS with an external MPI library
so that the set of
simulations can communicate. The n
simulations within the set can
use internal MPI parallelism also, so that mpirun -np x mdrun_mpi
for x
a multiple of n
will use x/n
ranks per simulation.
There are two ways of organizing files when running such
simulations. All of the normal mechanisms work in either case,
including -deffnm
.
-multidir
- You must create a set of
n
directories for then
simulations, place all the relevant input files in those directories (e.g. namedtopol.tpr
), and run withmpirun -np x gmx_mpi mdrun -s topol -multidir <names-of-directories>
. If the order of the simulations within the multi-simulation is significant, then you are responsible for ordering their names when you provide them to-multidir
. Be careful with shells that do filename globbing dictionary-style, e.g.dir1 dir10 dir11 ... dir2 ...
. This option is generally the most convenient to use.gmx mdrun -table
for the group cutoff-scheme works only in this mode. -multi
- You must organize that the filenames for each simulation in a set of
n
simulations have an integer0
throughn-1
appended to the filename (e.g.topol2.tpr
), and run withmpirun -np x gmx mdrun -multi n -s input
. The order of simulations within the set is determined by the integer appended.
Examples running multi-simulations¶
mpirun -np 32 gmx_mpi mdrun -multi
Starts a multi-simulation on 32 ranks with as many simulations n
as
there are files named topol*.tpr
for integers 0
to n-1
. Other
input and output files are suffixed similarly.
mpirun -np 32 gmx_mpi mdrun -multidir a b c d
Starts a multi-simulation on 32 ranks with 4 simulations. The input
and output files are found in directories a
, b
, c
, and d
.
mpirun -np 32 gmx_mpi mdrun -multidir a b c d -gpu_id 0000000011111111
Starts the same multi-simulation as before. On a machine with two
physical nodes and two GPUs per node, there will be 16 MPI ranks per
node, and 8 MPI ranks per simulation. The 16 MPI ranks doing PP work
on a node are mapped to the GPUs with IDs 0 and 1, even though they
come from more than one simulation. They are mapped in the order
indicated, so that the PP ranks from each simulation use a single
GPU. However, the order 0101010101010101
could run faster.
Running replica-exchange simulations¶
When running a multi-simulation, using gmx mdrun -replex n
means that a
replica exchange is attempted every given number of steps. The number
of replicas is set with the -multi
or -multidir
option, described
above. All run input files should use a different value for the
coupling parameter (e.g. temperature), which ascends over the set of
input files. The random seed for replica exchange is set with
-reseed
. After every exchange, the velocities are scaled and
neighbor searching is performed. See the Reference Manual for more
details on how replica exchange functions in GROMACS.
Controlling the length of the simulation¶
Normally, the length of an MD simulation is best managed through the [.mdp] option [nsteps](#nsteps), however there are situations where more control is useful. gmx mdrun -nsteps 100 overrides the [.mdp] file and executes 100 steps. gmx mdrun -maxh 2.5 will terminate the simulation shortly before 2.5 hours elapse, which can be useful when running under cluster queues (as long as the queuing system does not ever suspend the simulation).
Running a membrane protein embedding simulation¶
This is a module to help embed a membrane protein into an equilibrated lipid bilayer at a position and orientation specified by the user.
This method was initially described as a ProtSqueeze technique
(Yesylevskyy S.O., J Chem Inf Model 47(5) (2007) 1986-94) and
later implemented in GROMACS as g_membed tool (Wolf et al, J Comp Chem 31 (2010) 2169-2174).
Currently the functionality of g_membed is available in mdrun if
-membed
option is specified (see below).
The main advantage is that it is possible to use very complex lipid bilayers with a number of different components that have been relaxed for a long time in a previous simulation. In theory that could be accomplished with a procedure similar to gmx solvate, but since lipids are much larger than water molecules it will lead to a large vacuum layer between the protein and membrane if we remove all molecules where any atom is overlapping. Instead, this module works by first artificially shrinking the protein in the xy-plane, then it removes lipids that overlap with a much smaller core, after which we gradually push the protein atoms back to their initial positions, while using normal dynamics for the rest of the system so lipids adapt to the protein.
To use membrane embedding, start by building a lipid bilayer that is just-so-slightly larger in the xy-plane than what you expect to need in the end, and make sure you have enough water outside the membrane to accommodate globular domains. Place the protein in the same coordinate file (and topology) as your lipid bilayer, and make sure it is in the orientation and position you want right in the middle of the bilayer.
The first settings have to be entered in the mdp file that controls your simulation. You need an energy group corresponding to your protein, this group should be frozen (all dimensions), and we should exclude all interactions inside the protein to avoid problems when it is distorted. For instance:
integrator = md
energygrps = Protein
freezegrps = Protein
freezedim = Y Y Y
energygrp_excl = Protein Protein
You will also need a number of settings for the actual membrane
embedding process. These are entered as similar name and value pairs,
but in the separate text data file embed.dat
that you provide as
the argument to the -membed
option (we refer to the below
when explaining the process). The embedding works in for stages:
- The protein is resized around its center of mass by a factor
xy
in the xy-plane (the bilayer plane), and a factorz
along the z-axis (normal to the bilayer). If the height of the protein is the same or smaller than the thickness of the membrane, a z-fraction larger than 1.0 can prevent the protein from being enveloped by the lipids. - All lipid and solvent molecules overlapping with the resized protein are removed. All interactions inside the protein are turned off to prevent numerical issues for small values of the scaling fractions.
- A single md step is performed, where atoms in the rest of the system are moved.
- The resize factors are adjusted by the small amounts
(1-xy)/nxy and (1-z)/nz, where
nxy
andnz
are the number of iterations to use. The resize factor for the xy-plane is adjusted first. The resize factor for the z-direction is not changed until the xy factor is 1.0 (afternxy
iterations). - Steps 3 and 4 are repeated until the protein has again reached its original size, i.e. after nxy+nz iterations. After the embedding you might still want to perform a short relaxation.
Parameters that can be specified in embed.dat
, with default
values that will be used if the setting is omitted:
xyinit
(0.5) Resize factor for the protein in the xy dimension before starting embedding.xyend
(1.0) Final resize factor in the xy dimension.zinit
(1.0) Resize factor for the protein in the z- dimension before starting embedding.
zend
(1.0) Final resize faction in the z dimension.nxy
(1000) Number of iteration for the xy dimension.nz
(0) Number of iterations for the z dimension.rad
(0.22) Probe radius to check for overlap between the group to embed and the membrane.pieces
(1) Perform piecewise resize. Select parts of the group to insert and resize these with respect to their own geometrical center.asymmetry
(no) Allow asymmetric insertion, i.e. the number of lipids removed from the upper and lower leaflet will not be checked.ndiff
(0) Number of lipids that will additionally be removed from the lower (negative number) or upper (positive number) membrane leaflet.maxwarn
(0) Largest number of membed warnings allowed.