CUDA crash when using multiple robots with different DoF and use_cuda_graph=True

Isaac Sim Version

4.0.0
4.1.0
4.2.0
4.5.0
2023.1.1
2023.1.0-hotfix.1
Other (please specify):

Operating System

Ubuntu 22.04
Ubuntu 20.04
Windows 11
Windows 10
Other (please specify):

GPU Information

  • Model: NVIDIA RTX 4090
  • Driver Version: 570.133.20

Topic Description

Detailed Description

I am using cuRobo inside an Omniverse Isaac Sim setup via a custom Kit extension. Each robot has its own behavior script with a dedicated MotionGen instance initialized with use_cuda_graph=True. The simulation includes robots with different degrees of freedom (e.g., 6-DoF and 4-DoF).

Everything runs fine during the first simulation execution. However, on replay (i.e., resetting the timeline and starting again), I consistently hit a crash with cudaErrorIllegalInstruction during motion planning.

The issue only occurs when multiple robots with different DoF are present and use_cuda_graph=True. Disabling use_cuda_graph avoids the crash.

According to cuRobo docs, this may be expected behavior:

Once a CUDA graph is generated, we cannot change the dimensions of any of the tensors (e.g., DoF, timesteps, number of seeds). While cuRobo attempts to regenerate the CUDA graph on dimension change, PyTorch crashes when CUDAGraph.reset() is called.

Looking for a workaround or a supported way to safely use CUDA graphs in replayable simulations involving robots with different DoF.


Steps to Reproduce

  1. Create a Kit extension using cuRobo with use_cuda_graph=True.
  2. Add multiple robots with different DoF (e.g., one 6-DoF and one 4-DoF).
  3. Initialize a separate MotionGen per robot in their respective behavior scripts.
  4. Run the simulation — the first run works fine.
  5. Reset the simulation timeline and run again — crash occurs.

Error Messages

Python traceback (excerpt):

RuntimeError: CUDA error: an illegal instruction was encountered  
CUDA kernel errors might be asynchronously reported...  
[omni.kit.scripting.scripts.utils] Python Scripting Error:  
...  
[carb.cudainterop.plugin] CUDA error 715: cudaErrorIllegalInstruction  
...  
[omni.physx.plugin] PhysX ABORT error: PhysX cannot start GPU simulation because of previous CUDA errors! Error code 715!  
...  
[omni.kit.notification_manager.manager] PhysX has reported too many errors, simulation has been stopped.  

Screenshots or Videos

(Not applicable)


Additional Information

What I’ve Tried

  • Disabled use_cuda_graph → simulation works fine even after replay.
  • Verified crash only happens when robots with different DoF are present.
  • Reviewed cuRobo documentation regarding limitations of CUDA graphs.
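
The fallback implied by these observations could be sketched roughly as follows: enable CUDA graphs only when every robot in the scene has the same DoF, and disable them otherwise. The helper name should_use_cuda_graph and the robot names in the dictionary are hypothetical and not part of cuRobo. The returned flag is meant to be passed as use_cuda_graph when each robot's MotionGen is configured. This only avoids the crash by giving up CUDA graphs in mixed-DoF scenes; it does not make CUDA graphs safe on replay.

```python
from typing import Mapping


def should_use_cuda_graph(robot_dofs: Mapping[str, int]) -> bool:
    """Return True only when all robots in the scene share one DoF.

    Hypothetical helper, not a cuRobo API. Mixed-DoF scenes fall back to
    use_cuda_graph=False, the setting that avoided the crash on replay
    in this report.
    """
    return len(set(robot_dofs.values())) <= 1


# Scene matching the report: one 6-DoF robot and one 4-DoF robot.
# The robot names are made up for illustration.
scene = {"arm_6dof": 6, "arm_4dof": 4}

# Pass this flag as use_cuda_graph when building each robot's planner config.
use_cuda_graph = should_use_cuda_graph(scene)
```

With this sketch a scene of identical robots keeps CUDA graphs enabled, while the mixed 6-DoF/4-DoF scene from the report runs without them.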

Related Issues

  • cuRobo documentation (known issue) notes the limitation of reusing CUDA graphs with different tensor shapes.

Additional Context

  • cuRobo version: nvidia_curobo-0.7.6
  • Torch version: 2.1.0+cu118
  • NumPy: 1.26.0
  • Isaac Sim: 4.0.0
  • cuRobo installation mode: Python

Looking for any stable workaround or best practice when using use_cuda_graph=True with mixed-DoF robots that need to rerun simulations.

cuRobo is compatible with Isaac Sim 4.5.0. Please upgrade to the latest release.
