I need some time to observe the simulation results before I can return the reward buffer, since the reward depends on those results. While I am observing the simulation, the agent should not be trained. However, I can't find a way to pause training. What should I do to solve this problem?
If you don't have your own RL library designed for this workflow, one thing you could try is to modify the `step()` function in `vec_task.py`. The RL algorithm calls `step()` to retrieve the buffers it needs for training, so you could run multiple simulation steps there in a loop, using `gym.simulate()` to step the simulation and `post_physics_step()` to compute the observations.
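As a rough illustration of that idea, here is a minimal, self-contained sketch of the pattern. The class and method names (`PausedObservationTask`, `observe_steps`) are hypothetical, and the Isaac Gym calls are replaced with stubs; in the real `vec_task.py` you would call `self.gym.simulate(self.sim)` and your task's `post_physics_step()` instead.

```python
# Hypothetical sketch: run several extra physics steps inside step()
# before computing the buffers, so the RL algorithm only sees results
# after the "observation" period. Stubs stand in for Isaac Gym calls.

class PausedObservationTask:
    def __init__(self, observe_steps=10):
        # observe_steps: how many simulation steps to run per step() call
        # while results are being observed (assumed parameter, not Isaac Gym API)
        self.observe_steps = observe_steps
        self.sim_steps = 0
        self.obs_buf = []

    def simulate(self):
        # Stand-in for self.gym.simulate(self.sim) in vec_task.py
        self.sim_steps += 1

    def post_physics_step(self):
        # Stand-in for computing observation/reward buffers from sim state
        self.obs_buf = [self.sim_steps]

    def step(self, actions):
        # Advance the simulation several steps in a loop, then compute
        # the buffers once, so training only resumes afterwards.
        for _ in range(self.observe_steps):
            self.simulate()
        self.post_physics_step()
        return self.obs_buf


task = PausedObservationTask(observe_steps=5)
obs = task.step(actions=None)
print(obs)  # -> [5]: buffers computed after 5 simulation steps
```

The key point is that from the RL algorithm's perspective nothing changes: it still calls `step()` and receives buffers, but each call now covers as many physics steps as you need for observation.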