Hi everyone,
Summary: I have been trying to tune the reward function (both the individual terms and their scales) to get better training results. I am training a legged robot based on the Anymal Terrain example. Is there a way to log and visualize each individual reward term over the epochs? So far I can only see the mean total reward in TensorBoard or WandB, which doesn't tell me how each reward scale is affecting the model. Where would be a good place to start editing? In addition, can I save simulation values such as velocity and force for the agents?
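For context, this is the kind of bookkeeping I have in mind on the environment side: a minimal sketch (not the actual AnymalTerrain code) that accumulates each reward term separately and exposes per-episode means through an `extras`-style dict, which is what the training framework would then forward to TensorBoard/WandB. The names `episode_sums`, `lin_vel_xy`, and `torque` are just illustrative assumptions.

```python
import torch

class RewardTermTracker:
    """Keeps one running sum per reward term, per environment, and reports
    per-episode means when environments reset."""

    def __init__(self, num_envs, term_names, device="cpu"):
        self.episode_sums = {
            name: torch.zeros(num_envs, device=device) for name in term_names
        }
        self.extras = {"episode": {}}

    def add(self, term_rewards):
        # term_rewards: dict of {term_name: per-env tensor} for this step.
        for name, value in term_rewards.items():
            self.episode_sums[name] += value

    def on_reset(self, env_ids, episode_length_s):
        # Report the mean per-second contribution of each term for the
        # environments being reset, then clear their running sums.
        for name, sums in self.episode_sums.items():
            self.extras["episode"]["rew_" + name] = (
                torch.mean(sums[env_ids]) / episode_length_s
            ).item()
            sums[env_ids] = 0.0


# Usage sketch: call add() inside the reward computation each step and
# on_reset() wherever environments are reset.
tracker = RewardTermTracker(num_envs=4, term_names=["lin_vel_xy", "torque"])
tracker.add({"lin_vel_xy": torch.ones(4), "torque": -0.1 * torch.ones(4)})
tracker.on_reset(torch.tensor([0, 1]), episode_length_s=20.0)
print(tracker.extras["episode"])
```

The same pattern would presumably also cover the simulation values (e.g. appending root velocities or contact forces to a buffer each step and dumping it with `torch.save` at episode end), though I have not tried that yet.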
Attempt so far: I have found that the training data (loss, mean reward, and time) are logged in amp_continuous. Can I just define an extra function there to save the individual reward terms that make up the mean reward, or do I have to edit anything else?
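Concretely, something like the hypothetical helper below is what I would try to add on the agent side. It assumes the agent already holds a TensorBoard `SummaryWriter` (as the rl_games continuous agents do for their existing stats) and that the environment returns per-term values in `infos["episode"]`; both of those are assumptions I would need to verify, not the actual API.

```python
from collections import defaultdict

import torch
from torch.utils.tensorboard import SummaryWriter


class RewardTermLogger:
    """Collects per-term reward info from env `infos` and writes one
    TensorBoard scalar per term each epoch."""

    def __init__(self, writer: SummaryWriter):
        self.writer = writer
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def process_infos(self, infos):
        # Expecting something like infos["episode"] = {"rew_lin_vel_xy": 0.8, ...}
        for key, value in infos.get("episode", {}).items():
            v = value.item() if torch.is_tensor(value) else float(value)
            self._sums[key] += v
            self._counts[key] += 1

    def write(self, epoch_num):
        # One scalar curve per reward term, so each can be inspected separately.
        for key in self._sums:
            mean = self._sums[key] / max(self._counts[key], 1)
            self.writer.add_scalar(f"rewards/{key}", mean, epoch_num)
        self._sums.clear()
        self._counts.clear()


if __name__ == "__main__":
    # Standalone usage sketch; in practice this would be created in the
    # agent's __init__, fed infos during rollouts, and flushed each epoch.
    writer = SummaryWriter(log_dir="runs/reward_terms_demo")
    logger = RewardTermLogger(writer)
    logger.process_infos({"episode": {"rew_lin_vel_xy": 0.8, "rew_torque": -0.05}})
    logger.write(epoch_num=0)
    writer.close()
```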
I am quite new to RL and IsaacGym, and I am asking because inspecting the individual terms of the reward function seems like a reasonable way to arrive at usable reward functions. If that is not the right approach, I would really appreciate it if you could share how to tune rewards more systematically.
Thank you in advance.
Cheers,