Hi, I am using PPO for Isaac-Lift-Franka-v0, a task in NVIDIA Isaac Orbit. I found that the performance of PPO in rl_games is much better than in skrl, so I tried to set skrl's parameters to match rl_games's, but it did not help. I am wondering whether I overlooked something or whether the PPO implementations in rl_games and skrl are fundamentally different. Could you give me any advice or insight on how to make PPO in skrl perform as well as it does in rl_games?
When the Isaac-Lift-Franka-v0 environment reward function was fixed, only the rl_games and rsl_rl hyperparameters were updated.
Although for Isaac Orbit I use, as far as possible, the rl_games hyperparameters, in the recently released skrl-v1.0.0-rc.2 I have updated the hyperparameters for the Isaac-Lift-Franka-v0 environment, this time based on rsl_rl. Furthermore, I have added time-limit (episode truncation) bootstrapping to skrl's on-policy agents in that version, which allows for better mean reward values.
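To illustrate the idea (this is a conceptual sketch, not skrl's internal code): when an episode ends because of the time limit rather than a real terminal state, the reward for that transition is augmented with the discounted value estimate of the next state, so the advantage computation does not treat the truncation as a true termination.

```python
import torch

def time_limit_bootstrap(rewards: torch.Tensor,
                         truncated: torch.Tensor,
                         next_values: torch.Tensor,
                         discount_factor: float = 0.99) -> torch.Tensor:
    """Conceptual sketch of time-limit (truncation) bootstrapping.

    For transitions cut off by the episode time limit (truncated == True),
    add the discounted value estimate of the next state back to the reward,
    so the cut-off is not treated as a real terminal state.
    All tensors are expected to have shape (num_envs, 1).
    """
    return rewards + discount_factor * next_values * truncated.float()
```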
The next plot shows the mean reward for the Isaac-Lift-Franka-v0 environment for the mentioned libraries (with the updated skrl hyperparameters, not yet published in the Isaac Orbit code). Note that, since the number of parallel environments for the lift task was increased from 1024 to 4096, rl_games takes much longer to train with the available hyperparameters.
I am working on the skrl integration in the Isaac Orbit repository (to be pushed soon) which will include JAX support and an update of the training hyperparameters.
Meanwhile, you can play with the standalone training script for Isaac Orbit from the skrl docs: torch_lift_franka_ppo.py
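For orientation, the standalone script roughly follows skrl's usual structure: load and wrap the Isaac Orbit environment, define the policy and value models, configure PPO, and run a sequential trainer. The outline below is a simplified sketch with placeholder network sizes and hyperparameter values (module paths as in skrl-v1.0.0-rc.2), not the exact published script:

```python
import torch
import torch.nn as nn

from skrl.agents.torch.ppo import PPO, PPO_DEFAULT_CONFIG
from skrl.envs.loaders.torch import load_isaac_orbit_env
from skrl.envs.wrappers.torch import wrap_env
from skrl.memories.torch import RandomMemory
from skrl.models.torch import DeterministicMixin, GaussianMixin, Model
from skrl.trainers.torch import SequentialTrainer


class Policy(GaussianMixin, Model):
    def __init__(self, observation_space, action_space, device):
        Model.__init__(self, observation_space, action_space, device)
        GaussianMixin.__init__(self, clip_actions=False)
        # placeholder network sizes, not the published hyperparameters
        self.net = nn.Sequential(nn.Linear(self.num_observations, 256), nn.ELU(),
                                 nn.Linear(256, 128), nn.ELU(),
                                 nn.Linear(128, self.num_actions))
        # learnable log standard deviation of the Gaussian policy
        self.log_std_parameter = nn.Parameter(torch.zeros(self.num_actions))

    def compute(self, inputs, role):
        return self.net(inputs["states"]), self.log_std_parameter, {}


class Value(DeterministicMixin, Model):
    def __init__(self, observation_space, action_space, device):
        Model.__init__(self, observation_space, action_space, device)
        DeterministicMixin.__init__(self)
        self.net = nn.Sequential(nn.Linear(self.num_observations, 256), nn.ELU(),
                                 nn.Linear(256, 128), nn.ELU(),
                                 nn.Linear(128, 1))

    def compute(self, inputs, role):
        return self.net(inputs["states"]), {}


# load and wrap the Isaac Orbit environment
env = load_isaac_orbit_env(task_name="Isaac-Lift-Franka-v0")
env = wrap_env(env)

# rollout memory shared by all parallel environments
memory = RandomMemory(memory_size=96, num_envs=env.num_envs, device=env.device)

models = {"policy": Policy(env.observation_space, env.action_space, env.device),
          "value": Value(env.observation_space, env.action_space, env.device)}

# start from the default PPO configuration and override a few placeholder values
cfg = PPO_DEFAULT_CONFIG.copy()
cfg["rollouts"] = 96  # must match the memory size above

agent = PPO(models=models, memory=memory, cfg=cfg,
            observation_space=env.observation_space, action_space=env.action_space,
            device=env.device)

# placeholder number of timesteps
trainer = SequentialTrainer(cfg={"timesteps": 36000, "headless": True}, env=env, agents=agent)
trainer.train()
```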
Both skrl implementations (Figures 1 and 2) use the same hyperparameters:
Note that the initial_log_std and time_limit_bootstrap are not available in the current public version of Isaac Orbit.
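For completeness, here is how those two settings appear in code; the values below are placeholders, not the published hyperparameters. time_limit_bootstrap is a PPO configuration key (available from skrl-v1.0.0-rc.2), while the initial log standard deviation is simply the starting value of the policy's learnable log-std parameter:

```python
import torch
import torch.nn as nn
from skrl.agents.torch.ppo import PPO_DEFAULT_CONFIG

# enable episode-truncation bootstrapping (not available in skrl-v0.10.x)
cfg = PPO_DEFAULT_CONFIG.copy()
cfg["time_limit_bootstrap"] = True

# inside a GaussianMixin policy, initial_log_std is just the initial value of the
# learnable log standard deviation (0.0 here as a placeholder, i.e. std = 1.0)
num_actions = 9  # placeholder action dimension
log_std_parameter = nn.Parameter(torch.full((num_actions,), 0.0))
```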
And, sorry for bringing up an unrelated topic: when I use skrl for training, for example Isaac-Lift-Franka-v0, the training stops halfway, even though it is still far from exceeding the available GPU memory.
The error only says something like there was an error running Python.
Can you provide the error message or logs?
I have attached the error message. The training stopped suddenly.
Have you made any modifications to the task?
I am using my own environment, but it seems that only skrl has this problem; rl_games works well with the same environment.
Are you using the latest skrl version?
I am using the 0.10.0 version.
Mmmm, I have never had these types of problems with Isaac Orbit, but perhaps it could be something similar to what is described in the following discussion (which is fixed in the latest skrl versions).
Can you try the latest version (skrl-v1.0.0-rc.2)?
Are you running the example scripts included in skrl (e.g. torch_lift_franka_ppo.py), or the examples integrated into Isaac Orbit?
Hi, may I know whether the latest version, skrl-v1.0.0, is ready to use with Isaac Orbit?
Previously Isaac Orbit supported skrl-v0.10.2, so is skrl-v1.0.0 now ready to use with Isaac Orbit? Do we need to modify anything before using it?
I tried the latest version in Orbit, where I just adapted the imports to the new environment loaders and wrappers file hierarchy (link attached).
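In case it helps others, the change I mean is essentially the new import paths for the environment loaders and wrappers; a sketch of the before/after, assuming the skrl helper functions are used to load and wrap the Orbit task:

```python
# skrl-v0.10.x style imports (old module hierarchy)
# from skrl.envs.torch import load_isaac_orbit_env, wrap_env

# skrl-v1.0.0 style imports: loaders and wrappers are now separate submodules
from skrl.envs.loaders.torch import load_isaac_orbit_env
from skrl.envs.wrappers.torch import wrap_env

env = load_isaac_orbit_env(task_name="Isaac-Lift-Franka-v0")
env = wrap_env(env)  # the Isaac Orbit wrapper is auto-detected
```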
It runs successfully. However, with exactly the same parameters (the ones you provided) and environments (the same tasks), the performance of 1.0.0 and 0.10.2 is quite different: with 0.10.2 my robot can successfully grasp the object, whereas with 1.0.0 it does not at all.
May I know if you have any clue? Is there anything I missed?