Crash during training

I am training an RL policy with Omniverse Isaac Sim. During training with Headless=True, I got this error:

2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/Head.
2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/kr210_r3100_ultra/axis_2_kr210_r3100_ultra.
2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/kr210_r3100_ultra/axis_3_kr210_r3100_ultra.
2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/kr210_r3100_ultra/axis_4_kr210_r3100_ultra.
2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/kr210_r3100_ultra/axis_5_kr210_r3100_ultra.
2022-10-16 12:41:01 [81,972ms] [Warning] [omni.physx.plugin] Invalid PhysX transform detected for /World/kr210_r3100_ultra/axis_6_kr210_r3100_ultra.
Traceback (most recent call last):
  File "cartpole_train.py", line 40, in <module>
    model.learn(total_timesteps=500000)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\ppo\ppo.py", line 319, in learn
    reset_num_timesteps=reset_num_timesteps,
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 247, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 166, in collect_rollouts
    actions, values, log_probs = self.policy(obs_tensor)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1/exts/omni.kit.pip_torch-1_11_0-0.1.3+103.1.wx64.cp37/torch-1-11-0\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\common\policies.py", line 592, in forward
    distribution = self._get_action_dist_from_latent(latent_pi)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\common\policies.py", line 607, in _get_action_dist_from_latent
    return self.action_dist.proba_distribution(mean_actions, self.log_std)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1\kit\python\lib\site-packages\stable_baselines3\common\distributions.py", line 153, in proba_distribution
    self.distribution = Normal(mean_actions, action_std)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1/exts/omni.kit.pip_torch-1_11_0-0.1.3+103.1.wx64.cp37/torch-1-11-0\torch\distributions\normal.py", line 50, in __init__
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)
  File "C:\Omniverse\Library\isaac_sim-2022.1.1/exts/omni.kit.pip_torch-1_11_0-0.1.3+103.1.wx64.cp37/torch-1-11-0\torch\distributions\distribution.py", line 56, in __init__       
    f"Expected parameter {param} "
ValueError: Expected parameter loc (Tensor of shape (1, 5)) of distribution Normal(loc: torch.Size([1, 5]), scale: torch.Size([1, 5])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, nan, nan]], device='cuda:0')
2022-10-16 12:41:01 [82,321ms] [Warning] [carb.audio.context] 1 contexts were leaked
2022-10-16 12:41:01 [82,340ms] [Error] [omni.physx.plugin] USD stage detach not called, holding a loose ptr to a stage!

What is the problem?

Hi there,

This error generally occurs when the observation buffer in your task contains NaNs. It looks like PhysX is emitting "Invalid PhysX transform detected" warnings, which suggests the physics simulation has reached an invalid state. Could you double-check that your scene setup is correct by running with the viewer first?
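As a rough way to confirm this, you could scan each observation for NaN or Inf values before it reaches the policy. The sketch below is only an illustration: `flatten` and `find_non_finite` are hypothetical helper names, not part of Isaac Sim or Stable-Baselines3, and the sample observation is made up. It relies only on the standard library and on the `.tolist()` method that NumPy arrays and torch tensors both provide.

```python
import math


def flatten(values):
    """Yield scalars from nested lists/tuples.

    NumPy arrays and torch tensors are converted with .tolist() first.
    """
    if hasattr(values, "tolist"):
        values = values.tolist()
    if isinstance(values, (list, tuple)):
        for item in values:
            yield from flatten(item)
    else:
        yield values


def find_non_finite(obs):
    """Return the flat indices of NaN/Inf entries in an observation."""
    return [i for i, v in enumerate(flatten(obs)) if not math.isfinite(v)]


# Made-up observation with the same (1, 5) shape as in the traceback.
obs = [[0.1, float("nan"), 0.3, float("inf"), 0.5]]
bad = find_non_finite(obs)
if bad:
    print(f"non-finite observation entries at flat indices {bad}")
```

Calling something like this on the value returned by your task's observation function, right after each physics step, should show whether the NaNs originate in the simulation state rather than in the training code.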

@kellyg, I have the same issue with the viewer too. It happens randomly during training…

Is there any way to restart the environment?

Were you able to observe anything off in the simulation with the viewer? You can try using the random_policy.py script instead of training so that the NaNs don’t cause crashes when propagated to the training code. Generally, this is a result of bad simulation behaviour potentially caused by incorrect physics parameters, which will cause NaNs in simulation. When you retrieve the physics states as part of your observations, the NaNs are picked up and then propagated into the training code, leading to the error. Please also verify if you do indeed have NaN values in your observation buffer.
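On restarting the environment: one possible workaround, while the underlying physics problem is being tracked down, is to end the episode as soon as a non-finite observation or reward shows up, so the bad values never reach the policy. The sketch below assumes an environment with the older Gym-style `reset()` and 4-tuple `step()` interface; `NanGuard`, `fallback_obs`, the `nan_detected` info key, and `FakeEnv` are all hypothetical names used for illustration. With a real Gym environment you would likely subclass `gym.Wrapper` instead so that the observation and action spaces are forwarded. Note that a reset only helps if your task's reset restores every pose and velocity; it does not fix the parameters that made the simulation unstable.

```python
import math


def is_finite(values):
    """True if every scalar in a nested list/array/tensor is finite."""
    if hasattr(values, "tolist"):
        values = values.tolist()
    if isinstance(values, (list, tuple)):
        return all(is_finite(v) for v in values)
    return math.isfinite(values)


class NanGuard:
    """Ends the episode when the wrapped env returns NaN/Inf values."""

    def __init__(self, env, fallback_obs):
        self.env = env
        # Finite placeholder returned in place of the corrupted observation.
        self.fallback_obs = fallback_obs

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if not is_finite(obs) or not math.isfinite(float(reward)):
            # Report done=True so the caller resets the environment.
            info = dict(info, nan_detected=True)
            return self.fallback_obs, 0.0, True, info
        return obs, reward, done, info


class FakeEnv:
    """Stand-in environment that produces a NaN on its second step."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0]

    def step(self, action):
        self.t += 1
        obs = [float("nan"), 0.0] if self.t == 2 else [0.1, 0.2]
        return obs, 1.0, False, {}


env = NanGuard(FakeEnv(), fallback_obs=[0.0, 0.0])
env.reset()
first = env.step(0)   # normal step, passed through unchanged
second = env.step(0)  # NaN detected, episode is ended
print(second)
```

Because Stable-Baselines3's `DummyVecEnv` resets an environment automatically when it reports `done=True`, a guard like this should let training continue past an occasional blow-up. Stable-Baselines3 also provides a `VecCheckNan` wrapper, which may help pinpoint where the invalid values first appear.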