I changed numEnvs from 1024 to 2 and ran the following command.
… python rlg_train.py --task Ant
And then, I got the following error.
assert(self.batch_size % self.minibatch_size == 0)
As far as I understand, I should also change or adapt batch_size and minibatch_size to get rid of this error. But I could not find these two variables in any file. Does anybody know where I can find them?
Thanks in advance!
I get the same error as well, when changing num_envs and running any RL example.
I am on a completely fresh install of Ubuntu 20.04, so I don’t really understand what’s wrong.
As for the batch_size and minibatch_size, check out this thread, maybe this works for you?: Gym cuda error: running out of memory - #4 by toni.sm
Minibatch is in the PPO YAML. I've had this error before, and it seems somewhat misleading - it usually appears when I drastically reduce the number of environments, as though it doesn't understand that 2 goes into any minibatch size that 1024 goes into.
When I've reduced the minibatch it works, though if it's too close to the number of environments it appears to cause pauses in the physics rendering (I just made a post about the pauses this morning and figured out this relationship as I looked into it more).
This is a requirement of the rl_games RL library. Make sure that the parameters in the training YAML config file (under isaacgymenvs/cfg/train) satisfy the assertion requirement of
self.batch_size % self.minibatch_size == 0.
batch_size is computed as
horizon_length * num_envs.
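To see why shrinking num_envs alone trips the assertion, here is a minimal sketch of the divisibility check. The numeric values (horizon_length of 16, a minibatch_size of 16384) are illustrative assumptions, not necessarily the actual values in your Ant config - check your own YAML under isaacgymenvs/cfg/train.

```python
def check_minibatch(horizon_length, num_envs, minibatch_size):
    """Mirror the rl_games assertion: batch_size must be an
    exact multiple of minibatch_size."""
    batch_size = horizon_length * num_envs  # how rl_games computes it
    return batch_size % minibatch_size == 0

# Illustrative numbers (hypothetical config values, not the shipped defaults):
print(check_minibatch(16, 1024, 16384))  # 16 * 1024 = 16384, divisible -> True
print(check_minibatch(16, 2, 16384))     # 16 * 2 = 32, not divisible -> False
print(check_minibatch(16, 2, 32))        # reduce minibatch_size too   -> True
```

So after cutting numEnvs down to 2, you also need to lower minibatch_size in the train YAML (or raise horizon_length) until horizon_length * num_envs is a multiple of it.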