I would like to share, in this topic and in a more official way, the RL library (previously mentioned in this post) that we are developing/using in our lab…
skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation. In addition to supporting the Gym interface, it allows loading and configuring NVIDIA Isaac Gym environments, enabling agents’ simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.
Please visit the documentation for usage details and examples
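As a minimal, hypothetical usage sketch (module paths follow the 0.x documentation and may differ in later versions):

```python
# Minimal sketch, assuming skrl 0.x and OpenAI Gym are installed
import gym

from skrl.envs.torch import wrap_env

# wrap a standard Gym environment so agents interact with a uniform interface
env = wrap_env(gym.make("Pendulum-v1"))

# the same wrapper also accepts NVIDIA Isaac Gym (vectorized) environments, e.g.:
# from skrl.envs.torch import load_isaacgym_env_preview3
# env = wrap_env(load_isaacgym_env_preview3(task_name="Cartpole"))
```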
skrl version 0.3.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!)
Added
DQN and DDQN agents
Export memory to files
Postprocessing utility to iterate over memory files
Model instantiator utility to allow fast development
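As a rough sketch of how the memory export and post-processing utilities fit together (class and parameter names follow the documentation; exact signatures are assumptions that may vary by version):

```python
# Hypothetical sketch of exporting memories and iterating over the exported files
from skrl.memories.torch import RandomMemory
from skrl.utils import postprocessing

# ask the memory to export its contents to files (e.g. PyTorch .pt files) as it fills up
memory = RandomMemory(memory_size=1000, num_envs=4, export=True,
                      export_format="pt", export_directory="./memories")

# later, iterate over the exported files for offline analysis
for filename, data in postprocessing.MemoryFileIterator("./memories/*.pt"):
    print(filename)           # path of the current memory file
    print(list(data.keys()))  # tensors stored by name, e.g. "states", "actions", "rewards"
```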
skrl version 0.4.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R) if the documentation is not displayed correctly
Added
CEM, SARSA and Q-learning agents
Tabular model
Parallel training using multiprocessing (see the sketch at the end of this post)
Isaac Gym utilities
Changed
Initialize agents in a separate method
Change the name of the networks argument to models
Fixed
Reset environments after post-processing
As part of the Isaac Gym utilities, a lightweight web viewer is available for development without an X server
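Regarding the parallel training by scopes added in this version, here is a minimal, hypothetical sketch (class and argument names follow the skrl documentation; the agents, environment, and scope sizes are placeholders):

```python
# Hypothetical sketch of parallel (multiprocessing) training by scopes
from skrl.trainers.torch import ParallelTrainer

# agent_a and agent_b are assumed to be already-instantiated skrl agents
# sharing the same wrapped, vectorized environment `env` (e.g. an Isaac Gym task)
cfg_trainer = {"timesteps": 50000, "headless": True}
trainer = ParallelTrainer(env=env,
                          agents=[agent_a, agent_b],
                          agents_scope=[256, 256],  # sub-environments assigned to each agent
                          cfg=cfg_trainer)
trainer.train()
```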
skrl version 0.5.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
Added
TRPO agent
DeepMind environment wrapper
KL Adaptive learning rate scheduler
Handle gym.spaces.Dict observation spaces (OpenAI Gym and DeepMind Gym environments)
Forward environment info to agent record_transition method
Expose and document the random seeding mechanism
Define rewards shaping function in agents’ config
Define learning rate scheduler in agents’ config
Improve agent’s algorithm description in documentation (PPO and TRPO at the moment)
Changed
Compute the Generalized Advantage Estimation (GAE) in agent _update method
Move noises definition to resources folder
Update the Isaac Gym examples
Removed
compute_functions for computing the GAE from memory base class
Now, with the implementation of the learning rate scheduler and the reward shaper (both adapted from rl_games), performance comparable to rl_games is achieved (a configuration sketch is shown below)…
For example, for the Cartpole environment (RTX 3080)
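For reference, a hypothetical configuration sketch for these two features (the config keys follow the PPO default configuration in the documentation; the threshold and scale values are arbitrary):

```python
# Hypothetical configuration sketch (keys per the PPO default config; may vary by version)
from skrl.agents.torch.ppo import PPO, PPO_DEFAULT_CONFIG
from skrl.resources.schedulers.torch import KLAdaptiveRL

cfg = PPO_DEFAULT_CONFIG.copy()
# KL Adaptive learning rate scheduler (added in 0.5.0)
cfg["learning_rate_scheduler"] = KLAdaptiveRL
cfg["learning_rate_scheduler_kwargs"] = {"kl_threshold": 0.008}
# rewards shaping function defined directly in the agent's configuration (added in 0.5.0)
cfg["rewards_shaper"] = lambda rewards, timestep, timesteps: rewards * 0.01

# the config is then passed to the agent as usual, e.g.
# agent = PPO(models=models, memory=memory, cfg=cfg,
#             observation_space=env.observation_space, action_space=env.action_space, device=env.device)
```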
Thanks for sharing your RL library! I am researching different options and I am wondering how skrl compares to, for example, Stable-Baselines. Why did you decide to develop a new library instead of writing support for IsaacGym envs in Stable-Baselines?
The web viewer you developed looks great and will be very useful!
The decision to create a new library is described in the skrl statement:
skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI Gym and DeepMind environment interfaces, it allows loading and configuring NVIDIA Isaac Gym environments, enabling agents’ simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run
I wanted an RL library where the implementation of the algorithms would be readable, transparent, and uncomplicated, with the goal of allowing me, and anyone else who wants to give the library a chance, to understand and access the code without having to navigate a deep hierarchy of files
For example, in Stable-Baselines, if you want to understand DDPG you need to inspect:
I wanted each algorithm to be implemented independently, even when algorithms might share most of their code, as in the case of DDPG and TD3, or DQN and DDQN
I wanted a modular and well-organized library, where each file is separated and classified according to its functionality rather than stacked in a common folder
I wanted a library written end-to-end using the same machine learning framework, with the possibility of scaling to other frameworks (like TensorFlow) in a simple and organized way
I wanted to exploit the amazing features offered by Isaac Gym without neglecting the classics
I wanted to learn from the experience of developing something meaningful to me and share it with the community in the hope that it might be useful to someone else in some way or another.
I wanted to do all that and much more, and at this moment the best I can offer is skrl, which is not even close to being half good, but it has all my effort and dedication. And… to do all that on top of the existing implementations, I would have had to rewrite all their code, so the way I found to do it was to start from scratch.
Thanks for your reply and for clearing up the design methodology differences between skrl and stable-baselines. For sure, skrl looks very well-organized and modular, also when compared to stable-baselines. Thank you for the time you put into developing skrl and for sharing it with the community! I will be sure to test it out :)
skrl version 0.6.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
skrl version 0.7.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
Added
A2C agent
Isaac Gym (preview 4) environment loader
Wrap an Isaac Gym (preview 4) environment
Support for OpenAI Gym vectorized environments
Running standard scaler for input preprocessing
Installation from PyPI (pip install skrl)
Now, with the implementation of the running standard scaler (adapted from rl_games), better performance is achieved (see the configuration sketch below)…
For example, for the Ant environment:
[ORANGE] PPO agent with input preprocessor
[BLUE] PPO agent without input preprocessor
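A hypothetical configuration sketch for enabling the input preprocessor (config keys per the agent documentation; `env` is assumed to be an already wrapped environment, as in the first sketch above):

```python
# Hypothetical sketch of enabling input preprocessing (may vary by version)
from skrl.agents.torch.ppo import PPO_DEFAULT_CONFIG
from skrl.resources.preprocessors.torch import RunningStandardScaler

cfg = PPO_DEFAULT_CONFIG.copy()
# running standard scaler for observations (added in 0.7.0)
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": env.device}
# a value preprocessor can be configured in the same way, e.g.:
# cfg["value_preprocessor"] = RunningStandardScaler
# cfg["value_preprocessor_kwargs"] = {"size": 1, "device": env.device}
```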
skrl version 0.8.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
Added
AMP agent for physics-based character animation
Manual trainer
Gaussian model mixin
Support for creating shared models
Parameter role to model methods
Wrapper compatibility with the new OpenAI Gym environment API (by @JohannLange)
Internal library colored logger
Migrate checkpoints/models from other RL libraries to skrl models/agents
Configuration parameter store_separately to agent configuration dict
Set random seed and configure deterministic behavior for reproducibility (see the sketch at the end of this post)
Benchmark results for Isaac Gym and Omniverse Isaac Gym on the GitHub discussion page
Franka Emika real-world example
Changed
Models implementation as Python mixin [breaking change]
Multivariate Gaussian model (GaussianModel until 0.7.0) to MultivariateGaussianMixin
Trainer’s cfg parameter position and default values
Show training/evaluation display progress using tqdm (by @JohannLange)
Update Isaac Gym and Omniverse Isaac Gym examples
Fixed
Missing recursive arguments during model weights initialization
Tensor dimension when computing preprocessor parallel variance
Models’ clip tensors dtype to float32
Removed
Parameter inference from model methods
Configuration parameter checkpoint_policy_only from agent configuration dict
As a showcase for the basic Franka Emika real-world example, simulated versions of the environment for Isaac Gym and Omniverse Isaac Gym are provided to support advanced implementations :)
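Regarding the reproducibility item above, a minimal sketch of how the seeding utility can be used (the `deterministic` flag is an assumption based on the changelog entry):

```python
# Hypothetical sketch of the reproducibility utility
from skrl.utils import set_seed

# seed the Python, NumPy and PyTorch generators; optionally request deterministic behavior
set_seed(42, deterministic=True)  # `deterministic` flag assumed from the 0.8.0 changelog
```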
skrl version 0.9.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
Added
Support for Farama Gymnasium interface
Wrapper for robosuite environments
Weights & Biases integration (by @juhannc)
Set the running mode (training or evaluation) of the agents
Allow clipping of the gradient norm for DDPG, TD3, and SAC agents
Initialize model biases
Add RNN (RNN, LSTM, GRU, and any other variant) support for A2C, DDPG, PPO, SAC, TD3, and TRPO agents
Allow disabling training/evaluation progressbar
Farama Shimmy and robosuite examples
KUKA LBR iiwa real-world example
Changed
Forward model inputs as a Python dictionary [breaking change]
Return a Python dictionary with extra output values in model calls [breaking change] (see the model sketch at the end of this post)
Adopt the implementation of terminated and truncated over done for all environments
Fixed
Omniverse Isaac Gym simulation speed for the Franka Emika real-world example
Call agents’ record_transition method instead of the parent method to allow storing samples in memories during the evaluation
Move TRPO policy optimization out of the value optimization loop
Access to the categorical model distribution
Call reset only once for Gym/Gymnasium vectorized environments
Removed
Deprecated method start in trainers
As a showcase, a basic real-world example is provided where a KUKA LBR iiwa robot is controlled through a direct API and via ROS/ROS2. In addition, its simulated version, in Omniverse Isaac Gym, is provided to support advanced implementations :)
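To illustrate the breaking changes to the model API in this version (inputs forwarded as a Python dictionary and extra output values returned in a dictionary), here is a minimal Gaussian policy sketch following the pattern shown in the documentation (network sizes are arbitrary):

```python
# Minimal sketch of a policy using the dictionary-based model API
import torch
import torch.nn as nn

from skrl.models.torch import Model, GaussianMixin

class Policy(GaussianMixin, Model):
    def __init__(self, observation_space, action_space, device,
                 clip_actions=False, clip_log_std=True, min_log_std=-20, max_log_std=2):
        Model.__init__(self, observation_space, action_space, device)
        GaussianMixin.__init__(self, clip_actions, clip_log_std, min_log_std, max_log_std)

        self.net = nn.Sequential(nn.Linear(self.num_observations, 64), nn.ELU(),
                                 nn.Linear(64, 64), nn.ELU(),
                                 nn.Linear(64, self.num_actions))
        self.log_std_parameter = nn.Parameter(torch.zeros(self.num_actions))

    def compute(self, inputs, role):
        # model inputs arrive as a Python dictionary (e.g. inputs["states"]),
        # and extra output values are returned in a dictionary (empty here)
        return self.net(inputs["states"]), self.log_std_parameter, {}
```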
skrl version 0.10.0 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly
This unexpected new version has focused on supporting the training and evaluation of reinforcement learning algorithms in NVIDIA Isaac Orbit
Hi,
Can you suggest an easy way to create a policy ensemble using the skrl library? Say we have 3 agents pre-trained on a certain environment; I want to create a policy that picks the best action from the given 3 and updates the weights of all 3 agents.
skrl version 1.0.0-rc.1 is now available (it is under active development; bug reports and fixes, feature requests, and anything else are more than welcome: open a new issue on GitHub!). Please refresh your browser (Ctrl + Shift + R or Ctrl + F5) if the documentation is not displayed correctly.
Among the major features of this new version are:
JAX support
Multi-agent training (initial support)
Comprehensive documentation with new structure and theme
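As a rough, hypothetical sketch of how the new backends and multi-agent algorithms are laid out (module paths follow the release-candidate documentation and are assumptions that may change before the final 1.0.0 release):

```python
# Hypothetical sketch of the parallel PyTorch/JAX layout in 1.0.0-rc.1

# single-agent PPO: equivalent implementations per backend
from skrl.agents.torch.ppo import PPO, PPO_DEFAULT_CONFIG      # PyTorch
# from skrl.agents.jax.ppo import PPO, PPO_DEFAULT_CONFIG      # JAX

# initial multi-agent support (assumed module paths)
# from skrl.multi_agents.torch.ippo import IPPO, IPPO_DEFAULT_CONFIG
# from skrl.multi_agents.torch.mappo import MAPPO, MAPPO_DEFAULT_CONFIG
```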
Visit the library documentation to start training your favorite task in Isaac Gym preview, Isaac Orbit, or Omniverse Isaac Gym using PyTorch or JAX!
For questions and others, please open a new discussion in the skrl repository.
That way we leave this topic for announcements :)
The execution times used to construct this graph correspond to the complete execution of the script (including environment loading and, in the case of JAX, the XLA jit compilation, for example).