In the “compute_observations” function of “FrankaCabinet”, the following code reads the rigid body positions correctly:
self.franka_lfinger_pos = self.rigid_body_states[:, self.lfinger_handle][:, 0:3]
self.franka_rfinger_pos = self.rigid_body_states[:, self.rfinger_handle][:, 0:3]
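For context, this is roughly how I believe FrankaCabinet obtains and refreshes that tensor (a sketch from memory, not the exact example code); my environment imitates the same setup:

# sketch of the FrankaCabinet pattern (13 floats per rigid body: position, rotation, velocities)
rigid_body_tensor = self.gym.acquire_rigid_body_state_tensor(self.sim)
self.rigid_body_states = gymtorch.wrap_tensor(rigid_body_tensor).view(self.num_envs, -1, 13)
# refreshed each step before the positions are read
self.gym.refresh_rigid_body_state_tensor(self.sim)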
I tried to imitate FrankaCabinet, but found that with this code the rigid body position keeps alternating between two different positions whenever the environment is reset (“reset_idx”).
I did not use “gym.get_rigid_transform” and “gymapi.Transform()” to declare these variables. Could that have an impact?
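For clarity, the FrankaCabinet setup step I skipped looks roughly like this (a sketch from memory; “grasp_pose” is only an illustrative name, and the body name comes from the Franka asset):

# rough sketch of the FrankaCabinet pattern I did not reproduce
lfinger_handle = self.gym.find_actor_rigid_body_handle(env_ptr, franka_actor, "panda_leftfinger")
lfinger_pose = self.gym.get_rigid_transform(env_ptr, lfinger_handle)  # returns a gymapi.Transform
grasp_pose = gymapi.Transform()
grasp_pose.p = lfinger_pose.p  # as far as I understand, used only for one-time setup offsets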
Here is all of my code that involves the variables analogous to “self.franka_lfinger_pos” (in my case “self.racket_pos” and “self.ball_pos”):
def __init__(self, cfg, sim_device, graphics_device_id, headless):
    ...
    # buffers for the racket and ball positions, created with the same shape
    # as the slices read from self.rigid_body_states
    self.racket_pos = torch.zeros_like(self.rigid_body_states[:, self.racket_handle][:, 0:3])
    self.ball_pos = torch.zeros_like(self.rigid_body_states[:, self.racket_handle][:, 0:3])

def compute_observations(self):
    ...
    self.racket_pos = self.rigid_body_states[:, self.racket_handle][:, 0:3]
    self.ball_pos = self.rigid_body_states[:, self.ball_handle][:, 0:3]
    # dof_pos_scaled is computed earlier in this method
    self.obs_buf = torch.cat((dof_pos_scaled, self.Robot_dof_vel * self.dof_vel_scale,
                              self.racket_pos, self.ball_pos), dim=-1)

def compute_reward(self, actions):
    self.rew_buf[:], self.reset_buf[:] = compute_PingPong_reward(
        self.reset_buf, self.progress_buf, self.actions, self.racket_pos, self.ball_pos,
        self.num_envs, self.dist_reward_scale, self.action_penalty_scale, self.max_episode_length
    )
The values I set in “reset_idx” are as follows; I believe they are correct:
def reset_idx(self, env_ids):
    ...
    # sample a new ball position between the lower and upper limits
    ball_pos = self.ball_pos_lower_limits + torch.rand((len(env_ids), 1, 3), device=self.device) * (self.ball_pos_higher_limits - self.ball_pos_lower_limits)
    self.ball_rigid_pos[env_ids] = ball_pos.squeeze(1)

    # keep the default rotation and velocities, overwrite only the position
    ball_state = torch.cat((ball_pos, self.default_ball_states[env_ids, ..., 3:]), dim=-1)
    self.ball_root_states[env_ids] = ball_state

    self.gym.set_actor_root_state_tensor(self.sim, gymtorch.unwrap_tensor(self.root_state_tensor))
    self.gym.set_dof_position_target_tensor(self.sim, gymtorch.unwrap_tensor(self.Robot_dof_targets))
    self.gym.set_dof_state_tensor(self.sim, gymtorch.unwrap_tensor(self.dof_state))

    self.progress_buf[env_ids] = 0
    self.reset_buf[env_ids] = 0
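In case it matters, the root-state tensors referenced above are set up roughly like this (a sketch of what my snippets assume; “ball_actor_idx” is only an illustrative name):

# sketch of the assumed setup ("ball_actor_idx" is illustrative)
actor_root_tensor = self.gym.acquire_actor_root_state_tensor(self.sim)
self.root_state_tensor = gymtorch.wrap_tensor(actor_root_tensor).view(self.num_envs, -1, 13)
# per-env view of the ball actor's root state, plus a copy of its initial state
self.ball_root_states = self.root_state_tensor[:, ball_actor_idx:ball_actor_idx + 1]
self.default_ball_states = self.ball_root_states.clone()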
If I delete these lines from “compute_observations(self)”, the environment resets normally:
self.racket_pos = self.rigid_body_states[:, self.racket_handle][:, 0:3]
self.ball_pos = self.rigid_body_states[:, self.ball_handle][:, 0:3]
Similarly, if I use local variables instead of attributes, the environment also resets correctly.
However, with either of these workarounds I can no longer record the positions of the ball and the racket, which I also need in “compute_reward”.
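For example, this local-variable version of “compute_observations” resets fine, but racket_pos and ball_pos are then no longer available on self for “compute_reward” or for logging:

def compute_observations(self):
    ...
    # local variables: the reset works, but the positions are not kept on self
    racket_pos = self.rigid_body_states[:, self.racket_handle][:, 0:3]
    ball_pos = self.rigid_body_states[:, self.ball_handle][:, 0:3]
    # dof_pos_scaled is computed earlier in this method, as in my real code
    self.obs_buf = torch.cat((dof_pos_scaled, self.Robot_dof_vel * self.dof_vel_scale,
                              racket_pos, ball_pos), dim=-1)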