Non-Deterministic Behavior when Setting Data via Tensor API

Hello,

I am currently tracking down some interesting behavior of Isaac Gym in my simulation and was hoping to get some help with it here.

I have an environment with a Kuka IIWA robot, a table, and multiple objects on it, and I am currently facing unexpected non-deterministic behavior of the simulator. I have already reduced the code to a somewhat minimal case that translates to the following (schematic) snippet:

import random

import numpy as np
from isaacgym import gymapi, gymtorch  # isaacgym must be imported before torch
import torch

seed = 0
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)

# ....

def fetch_results(self):
    # Fetch the latest simulation results and refresh all state buffers
    self.gym.fetch_results(self.sim, True)
    self.gym.refresh_jacobian_tensors(self.sim)
    self.gym.refresh_mass_matrix_tensors(self.sim)
    self.gym.refresh_dof_state_tensor(self.sim)
    self.gym.refresh_actor_root_state_tensor(self.sim)
    self.gym.refresh_rigid_body_state_tensor(self.sim)

# ....

# Create the actors and activate the tensor API
self.create_envs()
self.gym.prepare_sim(self.sim)

# Get tensor interfaces for simulation
self._jacobian = self.gym.acquire_jacobian_tensor(self.sim, robot_name)
self._massmatrix = self.gym.acquire_mass_matrix_tensor(self.sim, robot_name)
self._root_states = self.gym.acquire_actor_root_state_tensor(self.sim)
self._dof_states = self.gym.acquire_dof_state_tensor(self.sim)
self._rb_states = self.gym.acquire_rigid_body_state_tensor(self.sim)

self.jacobian = gymtorch.wrap_tensor(self._jacobian)
self.massmatrix = gymtorch.wrap_tensor(self._massmatrix)
self.dof_states = gymtorch.wrap_tensor(self._dof_states).view(self.num_envs, -1, 2)
self.root_states = gymtorch.wrap_tensor(self._root_states)
self.rb_states = gymtorch.wrap_tensor(self._rb_states)

# Sync the buffers
self.fetch_results()

################ This is the critical part #######################
self.gym.set_dof_state_tensor(self.sim, self._dof_states)
self.gym.set_actor_root_state_tensor(self.sim, self._root_states)
##################################################################

# Do one step of simulation and sync the buffers again
self.gym.simulate(self.sim)
self.fetch_results()

If I run the above script multiple times, the DOF and actor root states are always identical after environment creation. However, after performing one simulation step, the results differ between runs if I execute the critical part, i.e. if I write back the data that I just synced from the simulation before stepping. Furthermore, the differences are not just minor numerical deviations but are rather large in magnitude, for example:

DOF State in Env 42 after Simulation Step (Run 1). First column is position, second is velocity:

       [[-2.1091e-08, -4.8297e-05],
        [ 7.9399e-09, -7.1478e-05],
        [-5.2626e-08,  1.4010e-04],
        [-1.2692e-08,  2.3726e-04],
        [-1.3608e-07, -2.4861e-05],
        [ 8.9806e-09, -2.7905e-04],
        [ 1.0058e-10, -3.8097e-06],
        [-4.3178e-08,  3.2610e-04],
        [ 5.0000e-02,  3.9473e-06],
        [-3.9407e-08, -1.5118e-04],
        [ 5.0000e-02, -2.2496e-05]]

DOF State in Env 42 after Simulation Step (Run 2). First column is position, second is velocity:

       [[ 1.1235e+00, -7.3033e-01],
        [-6.7171e-01,  1.3376e+00],
        [ 1.7379e+00, -1.1422e-01],
        [-1.4495e-01,  1.4970e-02],
        [-1.9888e+00, -3.1297e-01],
        [ 3.2727e-06, -1.5659e-03],
        [-5.7205e-07,  1.2782e-04],
        [-6.7996e-10, -4.3063e-08],
        [ 8.9829e-02,  2.0615e+00],
        [-2.2800e-06,  4.6914e-04],
        [ 3.0061e-01, -2.8294e+00]]

In this case, the second run looks faulty, since the initial joint position is [0, 0, 0, 0, 0, 0, 0, 0, 0.05, 0, 0.05] and the target position of the PD controller has been set to all zeros (driving the two non-zero initialized joints into their limit at 0.05).
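
For completeness, the zero targets are applied via the position-target tensor API; schematically it looks roughly like this (a sketch, where self.num_dofs and self.device stand in for the actual values in my setup, and the drive mode/gains are assumed to be configured via the DOF properties during actor creation):

# Schematic sketch of applying the all-zero PD position targets
# (assumes position drive mode plus stiffness/damping were set via the
# DOF properties when creating the actors; self.num_dofs and self.device
# are placeholders for my setup).
pos_targets = torch.zeros(self.num_envs * self.num_dofs, dtype=torch.float32, device=self.device)
self.gym.set_dof_position_target_tensor(self.sim, gymtorch.unwrap_tensor(pos_targets))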

Furthermore, the environments in which the states differ between runs also change between runs, although they do not seem to be completely random and typically make up around 1% of all simulated environments. To give an example: if I run the above script multiple times, sometimes only environment 42 differs and sometimes both environments 42 and 43 do.
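
To identify the differing environments, I dump the post-step DOF states of each run and compare them offline, roughly along these lines (a sketch; the file names are just illustrative placeholders):

# Sketch of the offline comparison between two runs (file names are
# illustrative placeholders; each file holds the post-step DOF states
# saved via torch.save(self.dof_states.cpu(), ...)).
run1 = torch.load("dof_states_run1.pt")  # shape: (num_envs, num_dofs, 2)
run2 = torch.load("dof_states_run2.pt")
same = torch.isclose(run1, run2, atol=1e-6).flatten(1).all(dim=1)
print("Differing environments:", torch.nonzero(~same).flatten().tolist())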

Did anyone here face similar issues so far, or does someone from the dev team have an idea of how to further debug this behavior?

Best Regards
Pascal

As far as I remember, we discussed a similar random behavior in another post but didn't get an answer from the NVIDIA team. The random behavior might be due to the solver algorithm, memory management, or the synchronization of the environments.
Another thing is that applying set_dof_state_tensor at every simulation step is considered "very dangerous" in the documentation; you are working against the simulation somehow. Do you check the return value of set_dof_state_tensor to make sure that the operation was successful for all environments?

Hi Abi!

Thanks for your answer. I wasn't aware that the "set" methods return a boolean flag. However, I have now checked the return values and they indicate that the operations succeed.
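
For reference, the check looks roughly like this (a minimal sketch; the setters return a boolean indicating whether the operation was accepted):

# Minimal sketch of the return-value check
ok_dof = self.gym.set_dof_state_tensor(self.sim, self._dof_states)
ok_root = self.gym.set_actor_root_state_tensor(self.sim, self._root_states)
assert ok_dof and ok_root, "setting the state tensors failed"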

But your answer made me realize that there are indexed versions of set_dof_state_tensor/set_actor_root_state_tensor, which are better suited to my needs anyway (as I only need to asynchronously reset specific environments from time to time). Replacing set_dof_state_tensor and set_actor_root_state_tensor with their indexed versions seems to have fixed the problem in my case.
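
Schematically, the replacement looks roughly like this (a sketch assuming one actor per environment, so the global actor indices equal the environment ids; env_ids_to_reset is a placeholder for the environments I want to reset):

# Sketch of pushing the state back only for selected environments.
# Assumes one actor per environment, so actor indices == env ids;
# env_ids_to_reset is a placeholder tensor of environment ids.
actor_indices = env_ids_to_reset.to(device=self.device, dtype=torch.int32)
self.gym.set_dof_state_tensor_indexed(
    self.sim, self._dof_states,
    gymtorch.unwrap_tensor(actor_indices), len(actor_indices))
self.gym.set_actor_root_state_tensor_indexed(
    self.sim, self._root_states,
    gymtorch.unwrap_tensor(actor_indices), len(actor_indices))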

I will further evaluate this "fix", but for now I suggest that anyone coming across the described problem try the indexed versions.

Thanks again for your comment!

Thanks for sharing! So you wanted to use it for resetting some environments. The indexed versions of set_dof_state_tensor/set_actor_root_state_tensor are designed exactly for this purpose. If you look at the Legged Gym library, you can see how they have used these functions.
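
Roughly, the pattern there is to first overwrite the wrapped buffers for the selected environments and then push only those environments to the simulator with the indexed call; a paraphrased sketch (not the exact Legged Gym code; self.default_dof_pos and env_ids are placeholders, and one actor per environment is assumed):

# Paraphrased sketch of a Legged-Gym-style partial reset (placeholders:
# self.default_dof_pos, env_ids; assumes one actor per environment).
def reset_idx(self, env_ids):
    # Overwrite the wrapped DOF buffer only for the environments to reset
    self.dof_states[env_ids, :, 0] = self.default_dof_pos  # positions
    self.dof_states[env_ids, :, 1] = 0.0                   # velocities
    env_ids_int32 = env_ids.to(dtype=torch.int32)
    # Push only the selected environments back to the simulator
    self.gym.set_dof_state_tensor_indexed(
        self.sim, self._dof_states,
        gymtorch.unwrap_tensor(env_ids_int32), len(env_ids_int32))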