I need some time to observe the simulation results, and then I can return the reward buffer depending on those results. While I am observing the simulation, my agent should not be trained. However, I can’t find a way to pause training. What should I do to solve this problem?
If you don’t have your own RL library designed for this workflow, one thing you could try is to modify the step() function in vec_task.py. The RL algorithm calls step() to retrieve the buffers it needs for training, so you could run multiple simulation steps there in a loop, calling gym.simulate() to step the simulation and post_physics_step() to compute the observations.
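As a rough illustration of that idea, here is a minimal, framework-agnostic sketch. The class and method names below (PausedRewardTask, simulate, compute_observations) are hypothetical stand-ins for gym.simulate() and post_physics_step(), not the real Isaac Gym API — the point is only the pattern: step() runs several simulation substeps internally and returns buffers to the trainer only after the observation window is complete, so no training happens in between.

```python
# Hedged sketch of a modified step(): run several simulation substeps
# before handing buffers back to the RL algorithm. All names here are
# hypothetical placeholders, not the real Isaac Gym / vec_task.py API.

class PausedRewardTask:
    def __init__(self, observe_steps=5):
        self.observe_steps = observe_steps  # substeps per training step
        self.state = 0.0                    # toy stand-in for sim state

    def simulate(self):
        # Stand-in for gym.simulate(): advance the physics by one tick.
        self.state += 1.0

    def compute_observations(self):
        # Stand-in for post_physics_step(): read back the current state.
        return self.state

    def step(self, action):
        # Loop over substeps so the trainer only sees the buffers after
        # the whole observation window has elapsed; training is
        # effectively "paused" because step() has not returned yet.
        obs_history = []
        for _ in range(self.observe_steps):
            self.simulate()
            obs_history.append(self.compute_observations())
        # Compute the reward once, from the accumulated results.
        reward = obs_history[-1] - obs_history[0]
        return obs_history[-1], reward


task = PausedRewardTask(observe_steps=5)
obs, reward = task.step(action=None)
```

Because the RL algorithm only interacts with the environment through step(), everything inside the loop is invisible to it; from the trainer’s point of view this is still a single environment step.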