How to train the model only when the environment resets, instead of at every step

I need some time to observe the results of the simulation before I can fill the reward buffer based on those results. While I am observing the simulation, my agent should not be trained, but I can't find a way to pause training. What should I do to solve this problem?

If you don't have your own RL library designed for this workflow, one thing you could try is to modify the step() function in vec_task.py. The RL algorithm calls step() to retrieve the buffers it needs for training, so you could run multiple simulation steps inside it in a loop, calling gym.simulate() to step the simulation and then post_physics_step() to compute the observations from the final state. That way the policy is only queried, and the agent only trained, after the simulation has run long enough for you to evaluate the result.
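
As a rough illustration, here is a minimal sketch of what that modified step() might look like. It assumes the VecTask conventions from Isaac Gym's Python examples (self.gym, self.sim, pre_physics_step(), post_physics_step(), and the obs/rew/reset buffers come from that framework); sim_steps_per_env_step is a hypothetical parameter you would tune to match your observation window:

```python
# Sketch of a modified step() in vec_task.py (Isaac Gym VecTask conventions).
# sim_steps_per_env_step is a hypothetical knob, not part of the framework.

def step(self, actions):
    # Apply the policy's actions once, as usual.
    self.pre_physics_step(actions)

    # Run several physics steps before returning control to the RL
    # algorithm, so training only happens after the simulation has had
    # time to produce an observable result.
    sim_steps_per_env_step = 10  # tune to your observation window
    for _ in range(sim_steps_per_env_step):
        self.gym.simulate(self.sim)
        self.gym.fetch_results(self.sim, True)

    # Compute observations and rewards once, from the final state, so the
    # reward buffer reflects the outcome of the whole simulated window.
    self.post_physics_step()

    return self.obs_buf, self.rew_buf, self.reset_buf, self.extras
```

The RL algorithm still sees a normal step() interface; it is simply unaware that each call now covers many physics steps, so no training update happens mid-window.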
