IsaacGymEnvs: Cartpole task reward edit

Hi, I’m trying to train an Acrobot to balance upright using PPO.
I started from the IsaacGymEnvs CartPole code.

[Library]

GitHub - NVIDIA-Omniverse/IsaacGymEnvs: Isaac Gym Reinforcement Learning Environments

[Copied Code]

IsaacGymEnvs/cartpole.py at main · NVIDIA-Omniverse/IsaacGymEnvs · GitHub


I also tried a custom reinforcement-learning simulation, such as Acrobot,
but it can’t find the right position.
[Image: Acrobot_train]

So I would like to edit the CartPole reward function, but I don’t understand what it means.


Q1. How was this reward formula derived?

reward = 1.0 - pole_angle * pole_angle - 0.01 * torch.abs(cart_vel) - 0.005 * torch.abs(pole_vel)


Q2. What do reset and reset_dist mean?


Here is the situation I want to achieve:
[Image: 20220303_212520]

Hi there,

We mostly hand-crafted the reward function. The main idea is to give a higher reward when the pole is close to upright (i.e. its angle is close to 0) and to penalize large movements (represented by the cart and pole velocities). We compute the reset buffer to determine which environments need to be reset to an initial state. An environment is reset either because it is in a bad state (the pole swung more than 90 degrees in either direction, or the cart moved too far from center, which is what reset_dist bounds) or because it has reached the maximum episode length.
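To make the two pieces concrete, here is a minimal sketch of the reward and reset logic described above. It is not the library's code: it uses NumPy in place of torch so it runs standalone, the function name `compute_reward_and_reset` is made up for illustration, and the values 3.0 for reset_dist and 500 for the maximum episode length are assumptions chosen only for the example.

```python
import numpy as np

def compute_reward_and_reset(pole_angle, pole_vel, cart_vel, cart_pos,
                             reset_dist, reset_buf, progress_buf,
                             max_episode_length):
    """Hypothetical helper: one reward and one reset flag per environment."""
    # Reward peaks at 1.0 when the pole is upright (angle 0) and nothing moves.
    # The squared angle punishes tilt; the small velocity terms punish motion.
    reward = (1.0 - pole_angle * pole_angle
              - 0.01 * np.abs(cart_vel)
              - 0.005 * np.abs(pole_vel))

    # "Bad state": cart farther than reset_dist from center,
    # or pole more than 90 degrees from upright.
    failed = (np.abs(cart_pos) > reset_dist) | (np.abs(pole_angle) > np.pi / 2)

    # Episode ran for its maximum number of steps.
    timed_out = progress_buf >= max_episode_length - 1

    # Flag those environments for reset; keep any flags that were already set.
    reset = np.where(failed | timed_out, 1, reset_buf)
    return reward, reset

# Four environments: upright, slightly tilted and moving, fallen over, cart too far.
pole_angle = np.array([0.0, 0.1, 2.0, 0.0])
pole_vel = np.array([0.0, 1.0, 0.0, 0.0])
cart_vel = np.array([0.0, 2.0, 0.0, 0.0])
cart_pos = np.array([0.0, 0.5, 0.0, 4.0])
reward, reset = compute_reward_and_reset(
    pole_angle, pole_vel, cart_vel, cart_pos,
    reset_dist=3.0,                      # assumed value for illustration
    reset_buf=np.zeros(4, dtype=np.int64),
    progress_buf=np.full(4, 10),
    max_episode_length=500,              # assumed value for illustration
)
print(reward, reset)
```

For an Acrobot, the same structure would presumably apply with the angle and velocity terms swapped for the link angles and joint velocities, and the failure condition adapted to whatever counts as a bad state there.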


Thanks a lot.
Your answer is really easy to understand.
Do you have any tips or ideas for the Acrobot balancing problem? :)

Thanks a lot! It works!!
[Image: ffinish]
