My ultimate goal is to apply an action for a random time period in [0, 1] s and only update the action after this period has elapsed. However, I don't see a way to implement this, because the actions are "automatically" generated somewhere and applied in pre_physics_step(). The script that generates the actions seems to be encrypted… right?
Do you have any tips?
What drive mode are you using: pos, vel, or effort? If you are using pos or vel and you don't update the value, it's as if you are not applying any new action; position and velocity targets don't need to be set every frame, only when the target changes. If you are using effort mode, you should update it every frame, so if you don't want to apply a new action, you have to keep the previous action during that random time period.
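To make the "keep the previous action" idea concrete, here is a minimal sketch of a per-env action hold with a randomly sampled duration. All names (`ActionHold`, `step`) are illustrative, not Isaac Gym API; you would call something like this from your own `pre_physics_step()` before writing targets:

```python
import numpy as np

class ActionHold:
    """Hold each env's action for a random period sampled from [0, 1) s.

    The stored action is only refreshed once that env's timer expires;
    until then the previously held action is returned unchanged.
    """

    def __init__(self, num_envs, action_dim, dt, rng=None):
        self.dt = dt                       # simulation step length in seconds
        self.rng = rng or np.random.default_rng(0)
        self.held = np.zeros((num_envs, action_dim))
        self.timer = np.zeros(num_envs)    # remaining hold time per env

    def step(self, new_actions):
        # Envs whose timer ran out accept the new action and resample a hold.
        expired = self.timer <= 0.0
        self.held[expired] = new_actions[expired]
        self.timer[expired] = self.rng.uniform(0.0, 1.0, size=int(expired.sum()))
        self.timer -= self.dt
        return self.held
```

With pos/vel drive modes you would pass `held` to the target-setting call each step; since the target only changes when a timer expires, this matches the "don't set a new target" behavior described above.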
I am using the pos mode. Even if I am not manually applying the action, the updated action will be part of my observation buffer, which will probably confuse my agent: "I updated my actions (based on my observation), but there has been no result…?"
So the question is how to prevent the actions from being updated at the root.
Do you mean you don't want to update (train) your agent during a time period, or do you want to force the model to hold (freeze) a state during a time period?
Assuming the action values are commands that control the joints: updating them every step might not be realistic, as the real system is not that fast due to reaction and communication delays. So yes, I want the agent not to be updated until, for instance, t = 250 ms.
Depending on what RL library you are using, this could be very different. In the case of legged_gym there is a file on_policy_runner.py; the main learn function is defined inside it, and it receives the data returned from the simulation, which happens inside legged_robot.py I think. You would have to add another parameter to the returned values that tells the train function to pause when it is 0, or you could add an extra observation parameter, check it inside the train function, and skip the rest of the code (or part of it) when it is 0. That way you can control the situations where you don't want your agent to be trained. I'm sure it's not easy to add this functionality to the legged_gym library! I don't know about other libraries; they might even have this built in, since it makes sense to skip training in some situations.
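The "extra parameter" idea above can be sketched as a per-step training mask that filters rollout data before the update. This is a generic illustration, not the actual on_policy_runner.py code; the function name and buffer layout are assumptions:

```python
import numpy as np

def filter_transitions(obs, actions, rewards, train_mask):
    """Keep only the steps where the agent was actually allowed to act.

    train_mask is a 0/1 array per step: 1 means "train on this transition",
    0 means the action was held, so the transition is dropped from the update.
    """
    keep = train_mask.astype(bool)
    return obs[keep], actions[keep], rewards[keep]
```

The simulation side would set the mask to 0 on steps where the held action was reapplied, and the learn loop would call something like `filter_transitions` before computing losses.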
Another question: do you want all the actors frozen during those no-training moments, or can they freely go to the next states? It doesn't make sense to freeze all the actors; that's like pausing the whole program!
I think I explained it badly, let me try again ;) My custom robot is super slow due to system delay, communication lags, etc. You give it an input, and it takes around 250 ms to rotate its joint. Implementing this behavior in Isaac Gym is a little tricky, because here the system reacts as soon as you provide the input, and the next step begins with new actions. Simply ignoring these new actions by overwriting them with 0 in the training script would definitely confuse the agent. So I was looking for a way around it ;)
Oh! That's a totally different story!
So you have a real robot and you want to model the latency!
What's the update rate of the robot? Obviously it's different from the latency.
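A common way to model actuation latency in simulation is to apply, at each step, the action the policy issued some fixed number of steps earlier. Here is a minimal sketch under that assumption (the class name is hypothetical, not Isaac Gym API):

```python
from collections import deque

import numpy as np

class DelayedActions:
    """FIFO buffer so the action applied at step t was issued delay_steps earlier."""

    def __init__(self, delay_steps, default_action):
        # Pre-fill the queue so the first delay_steps steps apply the default.
        self.queue = deque([np.array(default_action)] * delay_steps)

    def step(self, policy_action):
        self.queue.append(np.array(policy_action))
        return self.queue.popleft()
```

For example, with a 250 ms latency and a 5 ms physics step, `delay_steps` would be 50; the policy still sees and emits fresh actions every step, but the joints only receive them after the delay, which is closer to the real robot's behavior than overwriting actions with 0.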