Custom RL Example using Stable Baselines: errors with multiple envs

I followed the tutorial Custom RL Example using Stable Baselines to build my environment, but it doesn’t work. I copied the code and ran the cartpole example. When num_envs is 1, it runs, but when I change num_envs, it reports an error. My world contains multiple envs, but when I inspect the VecEnv after wrapping the env, it seems that I only have one environment, and the returned actions are one-dimensional.


Hi @2456496590

For the Custom RL example applied to multiple environments, this post may help:

Hi,
I tried to run the provided code (cartpole_train.py) in combination with the proposed changes to cartpole_task.py.

While setting self.num_envs = 8, I am getting the following error:

Traceback (most recent call last):
  File "cartpole_train1.py", line 179, in <module>
    trainer.train()
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/skrl/trainers/torch/sequential.py", line 75, in train
    self.single_agent_train()
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/skrl/trainers/torch/base.py", line 155, in single_agent_train
    states, infos = self.env.reset()
  File "cartpole_train1.py", line 114, in reset
    return self._observation_to_tensor(self._env.reset()), {}
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/exts/omni.isaac.gym/omni/isaac/gym/vec_env/vec_env_base.py", line 158, in reset
    self._task.reset()
  File "/home/kwrede/easscratch/Omniverse/isaac_sim_examples/forum_sb3/cartpole_task.py", line 95, in reset
    self._cartpoles.set_joint_positions(dof_pos, indices=indices)
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/exts/omni.isaac.core/omni/isaac/core/articulations/articulation_view.py", line 570, in set_joint_positions
    positions, device=self._device
IndexError: index 1 is out of bounds for dimension 0 with size 1
2023-09-15 07:51:35 [4,401ms] [Warning] [carb.audio.context] 1 contexts were leaked
/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/python.sh: line 41: 79590 Segmentation fault      (core dumped) $python_exe "$@" $args
There was an error running python

Is this error coming from changes in articulation_view.py since last year, or am I doing something wrong?
Sadly, I can’t attach my code because the upload is stuck in processing :(

Best regards and thanks in advance!

cartpole_play.py (995 Bytes)
cartpole_train.py (6.7 KB)
cartpole_task.py (6.5 KB)

I was finally able to upload my files, here they are :)

Hi @191kaw

Sorry for the late response.

Please note that to work with multiple environments, it is necessary to create clones using the GridCloner, as shown in the snippet.
In the files.zip you can find the modified task.

from omni.isaac.cloner import GridCloner

# ...

    def set_up_scene(self, scene) -> None:
        # retrieve file path for the Cartpole USD file
        assets_root_path = get_assets_root_path()
        usd_path = assets_root_path + "/Isaac/Robots/Cartpole/cartpole.usd"
        # add the Cartpole USD to our stage
        create_prim(prim_path="/World/Cartpole_0", prim_type="Xform", position=self._cartpole_position)
        add_reference_to_stage(usd_path, "/World/Cartpole_0")
        # create a GridCloner instance
        cloner = GridCloner(spacing=5)
        target_paths = cloner.generate_paths("/World/Cartpole", self.num_envs)
        position_offsets = np.zeros([self.num_envs, 3]) + np.array([0, 0, self._cartpole_position[2]])
        cloner.clone(source_prim_path="/World/Cartpole_0", prim_paths=target_paths, position_offsets=position_offsets)
        # create an ArticulationView wrapper for our cartpole - this can be extended towards accessing multiple cartpoles
        self._cartpoles = ArticulationView(prim_paths_expr="/World/Cartpole*", name="cartpole_view")
        # add Cartpole ArticulationView and ground plane to the Scene
        scene.add(self._cartpoles)
        scene.add_default_ground_plane()

        # set default camera viewport position and target
        self.set_initial_camera_params()
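
For context on the IndexError in the traceback above: without clones, the view matches a single prim, so its per-articulation buffers have one row, while the task's reset indexes rows 0 to 7. The following is only a minimal NumPy analogue of that situation, not the actual ArticulationView code (which works on torch tensors); the helper name and the two-DOF shape are assumptions made for illustration.

```python
import numpy as np


def set_joint_positions(buffer, positions, indices):
    """Hypothetical stand-in for a view setter: writes one row per articulation."""
    buffer[indices] = positions
    return buffer


num_envs = 8
num_dof = 2  # assumed: one cart joint and one pole joint
indices = np.arange(num_envs)
dof_pos = np.zeros((num_envs, num_dof))

# Only one prim matched by the view -> the buffer has a single row.
single_view = np.zeros((1, num_dof))
try:
    set_joint_positions(single_view, dof_pos, indices)
    single_ok = True
except IndexError as err:
    single_ok = False
    print("single prim:", err)

# Eight cloned prims matched -> one row per environment, so the write succeeds.
cloned_view = np.ones((num_envs, num_dof))
set_joint_positions(cloned_view, dof_pos, indices)
cloned_ok = bool((cloned_view == 0).all())
```

With the GridCloner in place, the expression "/World/Cartpole*" matches all cloned prims, which corresponds to the second case.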

Again, the recommended solution for working with multiple environments in Isaac Sim is to move to the RL framework defined in OmniIsaacGymEnvs.

Thanks for your reply! I had missed the point about the cloner; with it, your provided example works :)

However, I am trying to get this example running with SB3 and multiple parallel envs similar to this approach: Unable to train multi environment robot - #11

I wonder whether SB3 is capable of parallel simulation, given the following error:

Using cuda:0 device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
Traceback (most recent call last):
  File "cartpole_train.py", line 38, in <module>
    model.learn(total_timesteps=100000)
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/sac/sac.py", line 319, in learn
    progress_bar=progress_bar,
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 350, in learn
    progress_bar,
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 324, in _setup_learn
    progress_bar,
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/common/base_class.py", line 489, in _setup_learn
    self._last_obs = self.env.reset()  # pytype: disable=annotation-type-mismatch
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 64, in reset
    self._save_obs(env_idx, obs)
  File "/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/lib/python3.7/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 94, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (8,4) into shape (4,)
2023-10-02 13:59:11 [5,541ms] [Warning] [carb.audio.context] 1 contexts were leaked
/home/kwrede/.local/share/ov/pkg/isaac_sim-2022.2.1/python.sh: line 41: 2953058 Segmentation fault      (core dumped) $python_exe "$@" $args
There was an error running python

Or is it just about the formatting of the observations? Please find attached the code for the task with the cloner and the training script with SB3.

sb3_parallel.zip (3.0 KB)

Best regards,
Konstantin
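
The shapes in the ValueError are consistent with DummyVecEnv treating the wrapped task as a single environment: it holds an observation buffer with one row per wrapped env (here one row of 4 values) and tries to store the task's batched (8, 4) observation into row 0. A small NumPy sketch of that assignment, assuming the 4-dimensional observation seen in the traceback (it mimics the buffer write only and is not SB3's code verbatim):

```python
import numpy as np

obs_dim = 4        # observation size per cartpole, as in the traceback
n_wrapped = 1      # DummyVecEnv sees one wrapped env
task_num_envs = 8  # but the task simulates eight cartpoles at once

buf_obs = np.zeros((n_wrapped, obs_dim), dtype=np.float32)
batched_obs = np.zeros((task_num_envs, obs_dim), dtype=np.float32)

try:
    # Equivalent in spirit to: self.buf_obs[key][env_idx] = obs
    buf_obs[0] = batched_obs
    error_message = None
except ValueError as err:
    error_message = str(err)
    print(error_message)

# A buffer sized for the task's own environments accepts the batch as-is.
vec_buf = np.zeros((task_num_envs, obs_dim), dtype=np.float32)
vec_buf[:] = batched_obs
```

So the observations themselves are fine; the mismatch is between a batched task and a wrapper that expects one observation per wrapped env.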

Hi @191kaw

I think you need to wrap the environment with some stable_baselines3 VecEnv functionality to work with parallel environments, as is done in the Isaac Orbit Sb3VecEnvWrapper.
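
A rough sketch of that idea, under stated assumptions: instead of letting SB3 wrap the task in a DummyVecEnv, the batched task itself is presented as one vectorized env with num_envs entries. To keep the example self-contained it uses only NumPy; `FakeBatchedTask` and `BatchedVecEnvAdapter` are hypothetical names, and this is not the Isaac Orbit wrapper. In real use the adapter would subclass `stable_baselines3.common.vec_env.VecEnv`, implement its remaining abstract methods, and convert torch tensors returned by the task to NumPy arrays.

```python
import numpy as np


class FakeBatchedTask:
    """Hypothetical stand-in for an env that simulates num_envs cartpoles at once."""

    def __init__(self, num_envs=8, obs_dim=4):
        self.num_envs = num_envs
        self.obs_dim = obs_dim
        self._steps = np.zeros(num_envs, dtype=np.int64)

    def reset(self):
        self._steps[:] = 0
        return np.zeros((self.num_envs, self.obs_dim), dtype=np.float32)

    def step(self, actions):
        assert actions.shape[0] == self.num_envs
        self._steps += 1
        obs = np.tile(self._steps[:, None], (1, self.obs_dim)).astype(np.float32)
        rewards = np.ones(self.num_envs, dtype=np.float32)
        dones = self._steps >= 3
        self._steps[dones] = 0  # assumed: the task resets finished envs itself
        return obs, rewards, dones, {}


class BatchedVecEnvAdapter:
    """Exposes the batched task through the reset/step_async/step_wait calls
    that SB3 makes on a VecEnv. Only the data path is sketched here."""

    def __init__(self, task):
        self.task = task
        self.num_envs = task.num_envs  # SB3 reads the env count from here
        self._actions = None

    def reset(self):
        # Shape (num_envs, obs_dim): one row per simulated environment.
        return np.asarray(self.task.reset())

    def step_async(self, actions):
        self._actions = np.asarray(actions)

    def step_wait(self):
        obs, rewards, dones, _ = self.task.step(self._actions)
        infos = [{} for _ in range(self.num_envs)]  # one info dict per env
        return np.asarray(obs), np.asarray(rewards), np.asarray(dones), infos

    def step(self, actions):
        self.step_async(actions)
        return self.step_wait()


env = BatchedVecEnvAdapter(FakeBatchedTask(num_envs=8))
first_obs = env.reset()
obs, rewards, dones, infos = env.step(np.zeros((8, 1), dtype=np.float32))
```

The point of the design is that the number of environments reported to the algorithm matches the first dimension of every array the task returns, which is what the DummyVecEnv route above did not satisfy.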

off-topic advertising: why sb3 if there is skrl? 😅

Thanks for pointing out Isaac Orbit! I am currently trying it out.

The reason I wanted to use sb3 over skrl is that I have worked with sb3 in previous projects. Furthermore, it seems that the given skrl implementation of SAC within the rl_games is not working properly, as I mentioned here:

Best regards,
Konstantin