Looking at the examples provided in OmniIsaacGymEnvs, I have noticed a common pattern: in
pre_physics_step, the reset buffer is checked and any pending environment resets are triggered before the actions are applied. For example, in the Cartpole task,
pre_physics_step begins as follows:

    def pre_physics_step(self, actions) -> None:
        if not self._env._world.is_playing():
            return

        reset_env_ids = self.reset_buf.nonzero(as_tuple=False).squeeze(-1)
        if len(reset_env_ids) > 0:
            self.reset_idx(reset_env_ids)

However, this means that the first action executed in an environment after a reset was computed from the last observation before the reset, not from the post-reset state. Is this indeed the case, or am I missing something? Shouldn't environment resets be handled in the post-physics step, before the observations are fed through the RL model to obtain the actions?
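
To make the concern concrete, here is a minimal toy sketch of the two orderings. It is not the OmniIsaacGymEnvs code: ToyVecEnv, its methods, and the "echo" policy (which returns the observation it saw as its action) are all hypothetical and exist only so that a stale observation becomes visible as a mismatch between the action and the state it is applied to.

```python
# Toy illustration only: ToyVecEnv and everything in it is hypothetical and
# does not reproduce the OmniIsaacGymEnvs implementation.

class ToyVecEnv:
    """Each environment's state is a step counter; the observation is that counter."""

    def __init__(self, num_envs, horizon, reset_in_pre_physics):
        self.horizon = horizon
        self.reset_in_pre_physics = reset_in_pre_physics
        self.state = [0] * num_envs
        self.reset_buf = [False] * num_envs
        # Entries: (env_id, observation the action was based on, state it was applied to)
        self.mismatches = []

    def observe(self):
        return list(self.state)

    def reset_idx(self, env_ids):
        for i in env_ids:
            self.state[i] = 0
            self.reset_buf[i] = False

    def step(self, actions):
        # "pre_physics_step": optionally reset flagged environments first.
        if self.reset_in_pre_physics:
            self.reset_idx([i for i, flag in enumerate(self.reset_buf) if flag])
        # The echo policy returns the observation it saw, so any difference
        # between an action and the current state reveals a stale observation.
        for i, action in enumerate(actions):
            if action != self.state[i]:
                self.mismatches.append((i, action, self.state[i]))
        # "physics": advance every counter by one.
        self.state = [s + 1 for s in self.state]
        # "post_physics_step": flag environments that reached the horizon.
        self.reset_buf = [s >= self.horizon for s in self.state]
        if not self.reset_in_pre_physics:
            # Alternative ordering: reset now, before observations are returned.
            self.reset_idx([i for i, flag in enumerate(self.reset_buf) if flag])
        return self.observe()


def run(reset_in_pre_physics, steps=7):
    env = ToyVecEnv(num_envs=2, horizon=3, reset_in_pre_physics=reset_in_pre_physics)
    obs = env.observe()
    for _ in range(steps):
        actions = list(obs)  # echo policy: action equals the observation it was computed from
        obs = env.step(actions)
    return env.mismatches


if __name__ == "__main__":
    print("reset in pre-physics: ", run(True))
    print("reset in post-physics:", run(False))
```

In this toy setup, run(True) records a mismatch for every environment on the step right after each reset, because the action was computed from the terminal observation but applied to the freshly reset state. run(False) records none, since the reset happens before the next observation is returned. Whether the real framework behaves like the first case is exactly what I am asking.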