Randomness in Isaac

I created a simulation with 512 nv_ant.xml robots (one robot per environment). At each time step, I applied the same action to all of them, but they behave differently.
Can I make them end up in the same state after performing the same action?

Are you sure that they are not colliding with each other?

Can you trace where the randomness starts?

In my experience, there is indeed some nondeterminism in IsaacGym, but not in the states.

Note that if you’re processing observations or actions with PyTorch, you should preface your scripts with the following to make the PyTorch side of your pipeline reproducible:

import os
import torch

seed = 0

# required by cuBLAS for deterministic GPU kernels when
# torch.use_deterministic_algorithms(True) is enabled
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
# stop cuDNN autotuning from picking different kernels between runs
torch.backends.cudnn.benchmark = False
# raise an error on any op that has no deterministic implementation
torch.use_deterministic_algorithms(True)
torch.manual_seed(seed)

For completeness, you can also set the seeds or RNG states for numpy and standard Python:

import random
import numpy as np

seed = 0

random.seed(seed)
np.random.seed(seed)

See the PyTorch docs for more information on this topic.

Going back, the nondeterminism I was referring to stems from the net contact forces on the GPU.
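For reference, here is a minimal sketch of the kind of script that logs such per-step sums, assuming Isaac Gym’s Python tensor API on the GPU pipeline; the asset path, environment spacing, and the random-force action below are placeholder assumptions, not the exact setup:

from isaacgym import gymapi, gymtorch
import torch

gym = gymapi.acquire_gym()

# GPU pipeline so the state and contact tensors live on the device
sim_params = gymapi.SimParams()
sim_params.dt = 1.0 / 60.0
sim_params.up_axis = gymapi.UP_AXIS_Z
sim_params.gravity = gymapi.Vec3(0.0, 0.0, -9.81)
sim_params.use_gpu_pipeline = True
sim_params.physx.use_gpu = True
sim = gym.create_sim(0, 0, gymapi.SIM_PHYSX, sim_params)

plane_params = gymapi.PlaneParams()
plane_params.normal = gymapi.Vec3(0.0, 0.0, 1.0)
gym.add_ground(sim, plane_params)

# placeholder asset location; adjust to wherever nv_ant.xml lives
asset = gym.load_asset(sim, "assets", "mjcf/nv_ant.xml", gymapi.AssetOptions())
num_dofs = gym.get_asset_dof_count(asset)

num_envs, spacing = 512, 2.0
for i in range(num_envs):
    env = gym.create_env(sim, gymapi.Vec3(-spacing, -spacing, 0.0),
                         gymapi.Vec3(spacing, spacing, spacing), 32)
    pose = gymapi.Transform()
    pose.p = gymapi.Vec3(0.0, 0.0, 1.0)
    gym.create_actor(env, asset, pose, "ant", i, 1)

gym.prepare_sim(sim)

root_states = gymtorch.wrap_tensor(gym.acquire_actor_root_state_tensor(sim))
contact_forces = gymtorch.wrap_tensor(gym.acquire_net_contact_force_tensor(sim))

# one random action, repeated so that every environment receives the same command
torch.manual_seed(0)
actions = (2.0 * torch.rand(num_dofs, device="cuda:0") - 1.0).repeat(num_envs)

for step in range(25):
    gym.set_dof_actuation_force_tensor(sim, gymtorch.unwrap_tensor(actions))
    gym.simulate(sim)
    gym.fetch_results(sim, True)
    gym.refresh_actor_root_state_tensor(sim)
    gym.refresh_net_contact_force_tensor(sim)
    print(step, root_states.sum().item(), contact_forces.sum().item())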

Below are the results I get for the first second of simulation for two repeated runs of my script:

| Step | Actor state sum (1st run) | Actor state sum (2nd run) | Net contact sum (1st run) | Net contact sum (2nd run) | Different |
|------|---------------------------|---------------------------|---------------------------|---------------------------|-----------|
| 0  | 2102.695556640625 | 2102.695556640625 | 15055.205078125   | 15055.205078125   |   |
| 1  | 2100.20166015625  | 2100.20166015625  | 14990.115234375   | 14990.1162109375  | x |
| 2  | 2087.267822265625 | 2087.267822265625 | 15164.1875        | 15164.1875        |   |
| 3  | 2087.260009765625 | 2087.260009765625 | 15098.4248046875  | 15098.423828125   | x |
| 4  | 2105.942626953125 | 2105.942626953125 | 15400.44921875    | 15400.44921875    |   |
| 5  | 2106.02978515625  | 2106.02978515625  | 15456.9765625     | 15456.9765625     |   |
| 6  | 2100.53662109375  | 2100.53662109375  | 15464.087890625   | 15464.0888671875  | x |
| 7  | 2096.452880859375 | 2096.452880859375 | 14730.203125      | 14730.203125      |   |
| 8  | 2098.6845703125   | 2098.6845703125   | 15038.4541015625  | 15036.212890625   | x |
| 9  | 2102.801513671875 | 2102.801513671875 | 15766.615234375   | 15766.615234375   |   |
| 10 | 2097.928466796875 | 2097.928466796875 | 15373.068359375   | 15382.4794921875  | x |
| 11 | 2098.5498046875   | 2098.5498046875   | 15409.013671875   | 15409.0146484375  | x |
| 12 | 2104.79443359375  | 2104.79443359375  | 15588.66015625    | 15584.486328125   | x |
| 13 | 2101.228515625    | 2101.228515625    | 15661.18359375    | 15661.18359375    |   |
| 14 | 2102.44140625     | 2102.44140625     | 14947.404296875   | 14947.404296875   |   |
| 15 | 2096.845703125    | 2096.845703125    | 15455.953125      | 15455.9521484375  | x |
| 16 | 2101.01123046875  | 2101.01123046875  | 15408.615234375   | 15408.615234375   |   |
| 17 | 2110.00732421875  | 2110.00732421875  | 15084.0888671875  | 15084.087890625   | x |
| 18 | 2106.254150390625 | 2106.254150390625 | 15433.94921875    | 15433.94921875    |   |
| 19 | 2099.318115234375 | 2099.318115234375 | 15446.5634765625  | 15446.5634765625  |   |
| 20 | 2109.6884765625   | 2109.6884765625   | 15391.0859375     | 15391.0859375     |   |
| 21 | 2098.9775390625   | 2098.9775390625   | 15654.236328125   | 15654.236328125   |   |
| 22 | 2100.4775390625   | 2100.4775390625   | 15398.6064453125  | 15398.6064453125  |   |
| 23 | 2109.431640625    | 2109.431640625    | 15019.9619140625  | 15019.962890625   | x |
| 24 | 2103.654296875    | 2103.654296875    | 15123.583984375   | 15122.5498046875  | x |

Note that the actor states are identical across the two runs, while the net contact forces sometimes match, but often don’t.

I need the contact forces for collision detection, and this discrepancy is particularly troubling, since it affects my agents’ rewards and therefore model optimisation as well.
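To illustrate why this matters, here is a hypothetical contact-based penalty (not my actual reward term): when contacts are turned into booleans with a threshold, a force value that differs only in its last digits can still flip the flag whenever it lies near the threshold, changing the reward between otherwise identical runs.

import torch

def contact_penalty(net_contact_forces: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    # net_contact_forces: (num_envs, num_bodies, 3) view of the net contact force buffer
    # count bodies whose contact force magnitude exceeds the threshold and penalise them
    in_contact = net_contact_forces.norm(dim=-1) > threshold
    return -in_contact.float().sum(dim=-1)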

It would be great to have this behaviour confirmed, and to know whether there is a feasible solution.

No, they will not collide with each other, as they are in different environments.

  1. The randomness starts at the first step. I generate an action for one robot and repeat it num_envs times. After the robots are stepped one frame, there is already a tiny difference between them; the error accumulates over the steps and produces a totally different situation (see the divergence check sketched after this list).
  2. I have fixed all the random seeds, and I do not think the randomness of torch or numpy is the problem, as all the actors receive the same actions.
  3. Is this a problem of Preview 3? I have just moved from Preview 2 to Preview 3.
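A minimal way to check that divergence, assuming the per-environment DOF positions are available as a (num_envs, num_dofs) tensor after the step (the names below are placeholders, not Isaac Gym API):

import torch

def max_divergence(dof_pos: torch.Tensor) -> float:
    # dof_pos: (num_envs, num_dofs) DOF positions refreshed after one simulation step;
    # with identical actions and identical initial states, every row should match row 0
    return (dof_pos - dof_pos[0]).abs().max().item()

# example with a synthetic tensor where env 3 has drifted slightly
dof_pos = torch.zeros(512, 8)
dof_pos[3, 0] = 1e-6
print(max_divergence(dof_pos))  # ~1e-06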

One reason that comes to my mind is that maybe the actions are not precisely synchronized; even a nanosecond difference between updating different environments might lead to some randomness-like behavior.


I agree. There is a chance that an internal synchronization problem leads to this situation.
As I am working on MPC stuff, I would hope that the robots can have the same behavior when performing the optimal action at each timestep.
