Error running rlg_train.py

Hello,

I’ve been working through the installation instructions for Isaac Gym. After successfully running all of the examples, as well as some Cartpole training with train.py inside rlgpu, I wanted higher performance, so I followed the instructions to install rl_games.

When I run rlg_train.py on a simple Cartpole example, I get the following error:

(rlgpu) ~/isaacgym/python/rlgpu$ python rlg_train.py --task Cartpole --headless
Importing module 'gym_37' (/home/jfoster/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_37.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/jfoster/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
PyTorch version 1.8.1
Device count 1
/home/jfoster/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/jfoster/.cache/torch_extensions as PyTorch extensions root...
Emitting ninja build file /home/jfoster/.cache/torch_extensions/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
Importing module 'rlgpu_37' (/home/jfoster/isaacgym/python/isaacgym/_bindings/linux-x86_64/rlgpu_37.so)
Setting seed: 1330
Started to train
Python
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
/home/jfoster/miniconda3/envs/rlgpu/lib/python3.7/site-packages/gym/logger.py:34: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize("%s: %s" % ("WARN", msg % args), "yellow"))
RL device:  cuda:0
512
1
4
0
Box([-1.], [1.], (1,), float32) Box([-inf -inf -inf -inf], [inf inf inf inf], (4,), float32)
Env info:
{'action_space': Box([-1.], [1.], (1,), float32), 'observation_space': Box([-inf -inf -inf -inf], [inf inf inf inf], (4,), float32)}
Traceback (most recent call last):
  File "rlg_train.py", line 167, in <module>
    runner.run(vargs)
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/torch_runner.py", line 139, in run
    self.run_train()
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/torch_runner.py", line 122, in run_train
    agent = self.algo_factory.create(self.algo_name, base_name='run', config=self.config)  
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/common/object_factory.py", line 15, in create
    return builder(**kwargs)
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/torch_runner.py", line 23, in <lambda>
    self.algo_factory.register_builder('a2c_continuous', lambda **kwargs : a2c_continuous.A2CAgent(**kwargs))
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/algos_torch/a2c_continuous.py", line 18, in __init__
    a2c_common.ContinuousA2CBase.__init__(self, base_name, config)
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/common/a2c_common.py", line 966, in __init__
    A2CBase.__init__(self, base_name, config)
  File "/home/jfoster/isaacgym/python/rlgpu/rl_games/rl_games/common/a2c_common.py", line 124, in __init__
    self.kl_threshold = config['kl_threshold']
KeyError: 'kl_threshold'

I’ve been trying to track this down, and I suspect something is wrong with my rl_games installation. Any ideas?

Don’t be like me, folks! Read the READMEs fully before crying foul. According to the “Known Issues” section of the rl_games repo:

Starting from rl-games 1.1.0 old yaml configs won’t be compatible with the new version: steps_num should be changed to horizon_length and lr_threshold to kl_threshold

The fix was as simple as opening each rlg_*.yaml file in python/rlgpu/cfg/train/rlg and renaming those two fields: steps_num becomes horizon_length, and lr_threshold becomes kl_threshold.
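
If you have a lot of config files, the rename can be scripted. Here is a rough sketch, not something shipped with Isaac Gym or rl_games: the helper names `rename_keys` and `update_configs` are made up for illustration, and it assumes the old names appear as plain `key: value` lines in the yaml files. It writes a `.bak` copy of each file before changing it.

```python
import re
from pathlib import Path

# Old key -> new key, per the rl_games "Known Issues" note quoted above.
RENAMES = {"steps_num": "horizon_length", "lr_threshold": "kl_threshold"}

# Match an old name only when it is a YAML key at the start of a line
# (after optional indentation), so comments and longer names are left alone.
_KEY_RE = re.compile(
    r"^([ \t]*)(" + "|".join(map(re.escape, RENAMES)) + r")([ \t]*:)",
    re.MULTILINE,
)


def rename_keys(text: str) -> str:
    """Return the YAML text with the old rl_games keys renamed."""
    return _KEY_RE.sub(
        lambda m: m.group(1) + RENAMES[m.group(2)] + m.group(3), text
    )


def update_configs(cfg_dir: str) -> None:
    """Rewrite every rlg_*.yaml in cfg_dir, keeping a .bak copy of each changed file."""
    for path in sorted(Path(cfg_dir).glob("rlg_*.yaml")):
        original = path.read_text()
        updated = rename_keys(original)
        if updated != original:
            path.with_name(path.name + ".bak").write_text(original)
            path.write_text(updated)
            print(f"updated {path}")
```

Calling `update_configs("python/rlgpu/cfg/train/rlg")` from the isaacgym root (adjust the path to wherever you run it from) should cover the files mentioned above. It is a plain text substitution rather than a YAML parse, so comments and formatting in the files stay untouched.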

Yes, there has been an update in the rl_games repo. We will be providing updated yaml config files in our next release.