GROMACS molecular dynamics simulations run increasingly slowly as the simulation progresses

I use GROMACS 2023.3 with CUDA 12.3 on an Ubuntu 22.04 machine with a single RTX 4090.
To reach my target of a 1 µs membrane simulation, I split it into ten consecutive 100 ns runs.

For each of the ten 100 ns runs, I execute the following terminal command:

gmx mdrun -v -deffnm ${istep} -nb gpu -bonded gpu -pme gpu -update gpu
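
In case the chaining matters: each chunk is prepared from the previous chunk's checkpoint, roughly as in the sketch below (the step7_*, step6.6_equilibration, topol.top, and index.ndx names are placeholders for my actual files):

# chain the ten 100 ns chunks; each grompp reads the previous checkpoint
cnt=1
cntmax=10
while [ ${cnt} -le ${cntmax} ]; do
    pstep="step7_$((cnt - 1))"            # previous chunk's file prefix
    istep="step7_${cnt}"                  # current chunk's file prefix
    if [ ${cnt} -eq 1 ]; then
        pstep="step6.6_equilibration"     # first chunk starts from equilibration
    fi
    gmx grompp -f step7_production.mdp -o ${istep}.tpr -c ${pstep}.gro \
               -t ${pstep}.cpt -p topol.top -n index.ndx
    gmx mdrun -v -deffnm ${istep} -nb gpu -bonded gpu -pme gpu -update gpu
    cnt=$((cnt + 1))
done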

The GROMACS .mdp file is as follows:

integrator              = md
dt                      = 0.004
nsteps                  = 25000000
nstxout-compressed      = 25000
nstxout                 = 0
nstvout                 = 0
nstfout                 = 0
nstcalcenergy           = 100
nstenergy               = 5000
nstlog                  = 5000
;
cutoff-scheme           = Verlet
nstlist                 = 20
rlist                   = 1.2
vdwtype                 = Cut-off
vdw-modifier            = Force-switch
rvdw-switch             = 1.0
rvdw                    = 1.2
coulombtype             = PME
rcoulomb                = 1.2
;
tcoupl                  = v-rescale
tc-grps                 = MEMB SOLV
tau-t                   = 1.0 1.0
ref-t                   = 303.15 303.15
;
pcoupl                  = C-rescale
pcoupltype              = semiisotropic
tau-p                   = 5.0
compressibility         = 4.5e-5  4.5e-5
ref-p                   = 1.0     1.0
;
constraints             = h-bonds
constraint-algorithm    = LINCS
continuation            = yes
;
nstcomm                 = 100
comm-mode               = linear
comm-grps               = MEMB SOLV

As can be seen in the plot below, when I simulate a membrane system of 100 total molecules (orange line), the wall time of each successive 100 ns run steadily increases, from about 100 minutes to over 250 minutes.

I tried halving my membrane system to 50 total molecules (blue line). This seemed to largely mitigate the issue of run times increasing as the GPU is used over time.

Why do the simulations slow down incrementally over time for my larger membrane system? Following this NVIDIA blog post by Alan Gray, I experimented with the nstlist parameter, but it made no difference; the issue persisted.
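
For reference, nstlist can be overridden at run time without regenerating the .tpr, so trying larger values is cheap; 300 below is just an example value:

# override the .mdp's nstlist = 20 for this run only
gmx mdrun -v -deffnm ${istep} -nb gpu -bonded gpu -pme gpu -update gpu -nstlist 300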

From the GROMACS side, do you have any recommendations for other .mdp parameters to try?

Alternatively, or in combination, is there anything I could do between each of the ten 100 ns runs, such as a "cool-down" period for the GPU (i.e., letting it sit idle for a few minutes)? Or perhaps clearing the GPU cache?
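
Concretely, I was imagining something like the following between chunks (gpu_between_runs.log is just a made-up name; the query fields are standard nvidia-smi ones):

# idle for five minutes, then log the GPU state so that run time can be
# correlated with temperature and clocks afterwards
sleep 300
nvidia-smi --query-gpu=timestamp,temperature.gpu,clocks.sm,power.draw \
           --format=csv >> gpu_between_runs.log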

Thanks in advance!

Best,
Evan

The GROMACS experts are here: https://gromacs.bioexcel.eu/

I have already posted this issue on the GROMACS forum, but no help was forthcoming. Since the problem involves both GROMACS and CUDA, I thought I would post it here too.

To ask a more targeted GPU question: as mentioned, each successive simulation gets slower. But if I restart my computer, the simulation time returns to its optimum (i.e., what it was for the first run, around ~100 minutes). What does restarting the computer do to the GPU that makes it efficient again? Does it clear or reset a cache (if that is even the pertinent term here)? And is there any way to do this via a terminal command, so that I do not need to restart the computer?
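
For concreteness, is a full reboot actually necessary, or would something like the following achieve the same reset? (Both presumably require root and a fully idle GPU, and I have not tested either.)

# reset the GPU in place
sudo nvidia-smi --gpu-reset -i 0

# or unload and reload the NVIDIA kernel modules (fails if a display
# server or any other process still holds the GPU)
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia nvidia_modeset nvidia_drm nvidia_uvm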

Thanks

None of the data presented suggests to me that the slowdown has anything to do with the GPU.

One thing you could check, as a generic cause of unexpected GPU slowdowns over time, is whether the GPU is overheating and thermal throttling is being applied. The output of nvidia-smi shows the current temperature as well as any active throttle reasons. I do not operate an RTX 4090 myself, but to my knowledge this particular GPU is not prone to overheating, with temperatures staying below 80°C even under heavy load.
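
For example (the exact fields vary a bit by driver version):

# one-shot report of temperature and any active throttle reasons
nvidia-smi -q -d TEMPERATURE,PERFORMANCE

# or sample every 60 seconds while a chunk is running
nvidia-smi --query-gpu=timestamp,temperature.gpu,clocks_throttle_reasons.active --format=csv -l 60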