Issues Controlling Articulation Joints via Python — GPU Physics Crash & Joint Velocity Limit

Operating System:
Windows
Linux
Kit Version:
110 (Kit App Template)
109 (Kit App Template)
108 (Kit App Template)
107 (Kit App Template)
106 (Kit App Template)
105 (Launcher)
Kit Template:
USD Composer
USD Explorer
USD Viewer
Custom also seen in Base-Editor
GPU Hardware:
A series (Blackwell)
A series (ADA)
A series
50 series
40 series
30 series
GPU Driver:
Latest
Recommended (573.xx)
Other 570.xxx
Work Flow:

I’m trying to control articulation joints and object motion using a Python script inside USD Composer (also happens in Basic-Editor). I created a mechanical structure with an articulation setup, where each link is connected by prismatic joints that move along the X and Z axes. One special aspect is that these prismatic joints are embedded inside cylinder/slider components, meaning the slider moves linearly relative to the rail.

I then attached a Python script to this mechanism. By reading and writing the joint state (position and velocity), the script automatically drives the mechanism’s motion. For example:
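from pxr import PhysxSchema, UsdPhysics

# Apply the joint state API to the joint prim (angular token shown here, as for a revolute joint)
jointStateAPI = PhysxSchema.JointStateAPI.Apply(revoluteJoint.GetPrim(), UsdPhysics.Tokens.angular)
# Write the joint position and velocity
jointStateAPI.CreatePositionAttr(45.0)
jointStateAPI.CreateVelocityAttr(180.0)

# Read the current joint position and velocity back
jointPosition = jointStateAPI.GetPositionAttr().Get()
jointVelocity = jointStateAPI.GetVelocityAttr().Get()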

Main Issue:

When I run the simulation using GPU physics in the physicsScene (or even when I don’t explicitly create a physicsScene and let the simulation auto‑create one), I get continuous error messages. Eventually Hydra and RTX rendering crash due to illegal CUDA memory access.

I’m trying to understand what might cause this. Is this a known issue with articulation joints, prismatic joints inside nested geometry, or joint state manipulation on the GPU?

Secondary Issue

When I switch to CPU physics, the simulation no longer crashes. However, I notice that modifying the joint state values does not allow me to achieve higher velocities. The actual joint speed seems capped at around ±100, even though:

  • My joint drives have no limits set

  • The physicsScene has maximum solver iterations

  • I’m directly writing velocity values into the joint state

Why is the velocity capped? Is there an internal PhysX limit, or does articulation joint velocity behave differently from what the API suggests?

Current Workaround

The only way I can currently achieve higher joint speeds is by:

  • Using a pure velocity drive with a high damping value and high target velocity, or

  • Applying a large force to the slider

Both methods work, but they defeat the purpose of directly controlling joint state via Python.
Reproduction Steps:
Error Code:

2026-04-07T07:10:10Z [344,633ms] [Warning] [omni.hydra] Rendering failed.
2026-04-07T07:10:10Z [344,633ms] [Error] [omni.usd] HydraEngine::render failed to end the compute graph: error code 6
2026-04-07T07:10:10Z [344,695ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2026-04-07T07:10:10Z [344,695ms] [Error] [carb.cudainterop.plugin] Failed to import external memory in CUDA
2026-04-07T07:10:10Z [344,695ms] [Error] [gpu.foundation.plugin] Cannot create cuda external memory for resource!
2026-04-07T07:10:10Z [344,696ms] [Error] [gpu.foundation.plugin] Texture creation failed for the device: 0.
2026-04-07T07:10:10Z [344,696ms] [Error] [carb.scenerenderer-rtx.plugin] Failed to allocate 1336x795 LdrColor resource for device mask 0x1
2026-04-07T07:10:10Z [344,696ms] [Warning] [omni.hydra] Rendering failed.
2026-04-07T07:10:10Z [344,696ms] [Error] [omni.usd] HydraEngine::render failed to end the compute graph: error code 6

and it repeats…

Hi and thanks for posting. What are your exact GPU and driver versions? You said it’s a 30 series. Have you tried changing drivers? I have never seen anything regarding “illegal CUDA memory access”. CUDA is part of the core driver, so if the driver is not a good match, CUDA could be affected. It may have nothing to do with Kit.

To solve this, you are going to have to break down the workflow and isolate the issue.

  1. Set up the objects and joints in the scene
  2. Without code attached, just hit spacebar to play the scene… does it crash or create messages?
  3. Try a much more simple joint, that is not embedded or restrained as much. Maybe the joint is set up in a way that is not physically allowed.
  4. Try using code to initiate the scene and to set initial velocities, but then let it “run” and do not try to write additional states. Read yes, but not write. You may be trying to write velocities against the actual physics engine that are “illegal”.

Here is a much more detailed breakdown of the situation from our AI. Please read through this and see if any of this helps. It probably comes down to a workflow problem where you are trying to force the GPU PhysX to do something it cannot do. So, as stated, you need to reverse engineer the problem, with a simpler and simpler scene until you find what is breaking it.

The GPU crash is very likely a PhysX GPU‑physics bug or instability triggered by directly mutating articulated joint state every frame, while the velocity “cap” you see on CPU comes from PhysX’s joint‑velocity limiting (maxJointVelocity / maxActuatorVelocity), not from the JointStateAPI itself. docs.omniverse.nvidia

1. About the GPU crash (cudaErrorIllegalAddress)

  • cudaErrorIllegalAddress means some CUDA kernel accessed invalid memory; in Omniverse this usually comes from low‑level PhysX or rendering code, and then Hydra/RTX fail to allocate textures or complete the compute graph. forums.developer.nvidia
  • The OmniPhysics docs say JointStateAPI is intended to set initial position and velocity of articulation joints; it does not explicitly guarantee that you can arbitrarily drive joints by rewriting those attributes every simulation step on GPU. docs.omniverse.nvidia

In practice, there are a few patterns that are known to cause GPU physics instability or crashes:

  • Highly stiff/high‑velocity articulated chains, especially with prismatic joints and complex nested geometry, can produce extreme impulses that destabilize GPU PhysX even when CPU solves them. docs.omniverse.nvidia
  • Mutating articulation joint state in ways that disagree with the solver (for example, large instantaneous jumps in joint velocity or position each frame) can create inconsistent internal state on GPU; CPU is more tolerant but can still clamp values. docs.omniverse.nvidia

Because CPU physics runs without crashing on the same scene, the most likely interpretation is:

  • Your articulation definition is formally valid, but the combination of:
    • multiple prismatic joints,
    • nested slider/rail geometry, and
    • direct per‑frame JointStateAPI writes on GPU
      is hitting an edge‑case / bug in the PhysX GPU backend, causing illegal memory access. forums.developer.nvidia

This is consistent with other reports where custom articulated environments or tensor‑based control cause CUDA illegal‑address errors from omni.physx plugins while CPU runs fine. github

What you can try to narrow it down

  • Run the same scripted control with GPU physics disabled but keep RTX/Hydra on; if the crash disappears, it confirms that the problem is in GPU PhysX, not rendering. forums.developer.nvidia
  • Reduce the structure to a minimal reproduction: one or two prismatic joints, simple boxes instead of nested cylinders, and the same JointStateAPI writes; if that works, re‑introduce complexity stepwise to find the triggering pattern. docs.omniverse.nvidia
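
The stepwise re-introduction described above can be organized as a small loop. This is only a sketch: `scene_is_stable` is a hypothetical callback that you would implement yourself (for example by assembling the listed components, pressing Play, and checking the log by hand), and the component names are placeholders, not anything from your scene.

```python
def find_triggering_component(components, scene_is_stable):
    """Add components one at a time and return the first one whose
    addition makes the scene unstable, or None if all pass.

    scene_is_stable is a hypothetical, user-supplied callable that takes
    the list of currently active components and returns True or False.
    """
    active = []
    for component in components:
        active.append(component)
        if not scene_is_stable(list(active)):
            return component
    return None


# Stand-in predicate for illustration only: pretend the scene breaks once
# nested geometry and per-frame joint state writes are both present.
def fake_scene_is_stable(active):
    return not {"nested_cylinder", "per_frame_writes"} <= set(active)


order = ["prismatic_x", "prismatic_z", "nested_cylinder", "per_frame_writes"]
print(find_triggering_component(order, fake_scene_is_stable))
```

Reordering the list and running it again helps separate "this component alone breaks things" from "this component breaks things only in combination with an earlier one".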

Given how specific your setup is, filing a bug with your USD and logs, plus a minimal USD+script that reproduces the GPU crash, is the only way NVIDIA can fix the illegal‑memory‑access issue cleanly. forums.developer.nvidia

2. Why velocities are “capped” on CPU

OmniPhysics/PhysX introduces two separate velocity limits for articulations:

  • PhysxJointAPI.maxJointVelocity: “Maximum joint velocity. Only applies to joints that are part of an articulation. The solver will apply appropriate joint‑space impulses in order to enforce the per‑axis joint‑velocity limit.” docs.omniverse.nvidia
  • PhysxArticulationDriveAPI.maxActuatorVelocity: “maximum achievable velocity for actuated joints,” with speedEffortGradient defining how this limit decreases as joint force/torque increases. docs.omniverse.nvidia

The articulation stability guide explicitly recommends using max joint velocity limits (for example 200–300 deg/s for revolute joints) to avoid instability, and notes that excessive velocities can destabilize simulations. docs.omniverse.nvidia

Putting this together:

  • Even if you write high values into the JointStateAPI velocity attribute, the solver still enforces maxJointVelocity on articulated joints and clamps them to the configured limit. docs.omniverse.nvidia
  • If maxJointVelocity and/or maxActuatorVelocity are at their defaults (which are conservative for stability), you will see an apparent cap (for example around 100 units/s) regardless of what you write directly to the joint state. docs.omniverse.nvidia

That is why:

  • Using a drive with high target velocity and sufficient maxForce can go faster: here you are working within the actuator model, and if maxActuatorVelocity is large enough the limiter is higher. docs.omniverse.nvidia
  • Applying a large force works because velocity then emerges from dynamics, again subject to those limits, but possibly reaching higher sustained speeds than your earlier cap. docs.omniverse.nvidia

To verify and adjust this in your scene:

  1. Inspect each articulation joint’s PhysxJointAPI:maxJointVelocity attribute; if unset, explicitly set it to a larger value in units consistent with your joint (distance/s for prismatic, degrees/s for revolute). docs.omniverse.nvidia
  2. For driven joints, check PhysxArticulationDriveAPI:maxActuatorVelocity, maxForce, speedEffortGradient, and velocityDependentResistance; low maxActuatorVelocity or a high gradient can effectively clamp speed. docs.omniverse.nvidia
  3. Re‑run the simulation and read back the actual JointStateAPI velocity; you should see higher values once limits are increased, though you may also need to reduce step size or adjust solver settings to keep things stable. docs.omniverse.nvidia
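
For step 3, one way to tell a hard cap apart from ordinary dynamics is to log pairs of commanded and measured velocity and check whether the measured values plateau at the same magnitude. The helper below is a plain-Python sketch with no Omniverse dependency; the function name, the tolerance, and the sample numbers are illustrative assumptions, and the pairs would come from your own script's reads of the joint state.

```python
def estimate_velocity_cap(samples, tolerance=1e-3):
    """Estimate a velocity cap from (commanded, measured) pairs.

    Returns the plateau magnitude if every sample that fell short of its
    command landed on the same measured magnitude (within tolerance),
    otherwise None. At least two short samples are required.
    """
    short = [abs(measured) for commanded, measured in samples
             if abs(commanded) - abs(measured) > tolerance]
    if len(short) < 2:
        return None
    if max(short) - min(short) > tolerance:
        return None  # shortfalls differ, so this does not look like a fixed cap
    return sum(short) / len(short)


# Illustrative numbers only: commands above 100 all read back as 100.
capped = [(50.0, 50.0), (150.0, 100.0), (-400.0, -100.0), (1000.0, 100.0)]
# Shortfalls that vary with the command suggest dynamics, not a clamp.
not_capped = [(50.0, 50.0), (150.0, 120.0), (400.0, 300.0)]

print(estimate_velocity_cap(capped))
print(estimate_velocity_cap(not_capped))
```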

3. Recommended control pattern

Based on the OmniPhysics docs and stability guide, a more “supported” approach than rewriting joint state every frame on GPU is:

  • Use JointStateAPI to set initial positions/velocities and to monitor state. docs.omniverse.nvidia
  • Use drives (position or velocity) with carefully chosen maxActuatorVelocity, maxForce, damping, and stiffness to achieve your desired motion profile, and use Python to update targets rather than raw state. docs.omniverse.nvidia

This aligns better with how the articulation solver is designed, respects internal limits, and is less likely to trigger GPU backend bugs.
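
As a rough illustration of "update targets rather than raw state", the sketch below ramps a target velocity toward the desired value with a bounded change per step, which avoids the large instantaneous jumps mentioned earlier. It is plain Python; `max_accel` and the helper names are assumptions made for this example, and each returned value is meant to be written to the joint drive's target velocity by your own script rather than into the joint state.

```python
def step_toward(current, desired, max_delta):
    """Move current toward desired by at most max_delta."""
    if max_delta < 0:
        raise ValueError("max_delta must be non-negative")
    delta = desired - current
    if abs(delta) <= max_delta:
        return desired
    return current + max_delta if delta > 0 else current - max_delta


def ramp_targets(start, desired, max_accel, dt, steps):
    """Return the sequence of drive targets for a number of update steps.

    max_accel is a hypothetical per-second limit on how fast the target
    may change; dt is the update interval in seconds.
    """
    targets = []
    value = start
    for _ in range(steps):
        value = step_toward(value, desired, max_accel * dt)
        targets.append(value)
    return targets


# Ramp from rest to 250 units/s, changing by at most 100 units/s per update.
print(ramp_targets(0.0, 250.0, max_accel=200.0, dt=0.5, steps=4))
```

Writing only when the ramped value actually changes also reduces the number of attribute writes per second, which is relevant to the cadence question below.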

Would you be able to share (even conceptually) how often you are writing joint positions/velocities (every frame, every few frames, or event‑based)? That cadence is important for judging how risky your current control pattern is and what a safer alternative would look like.

Thank you very much for your detailed response — several of the points you mentioned match my own suspicions as well.

Below is some additional information about my setup. These are the GPU, driver, and Kit versions I am currently using. I have not yet tested newer driver versions.
GPU: NVIDIA GeForce RTX 3070 (8 GB; according to the HUD it is not fully used), Driver Version: 570.211.01, CUDA Version: 12.8, kit-app-template: release 109

During my initial validation phase, I tested only partial assemblies of my mechanism. When running without any Python script, I was able to adjust the joint state position and velocity directly from the Property panel (or through my custom UI extension using jointStateAPI.CreateVelocityAttr()) and achieve high motion speeds without any issues. This worked even when the sub‑layer scene did not contain a physicsScene at all.

The first time I encountered the crash was after I assembled all mechanical modules into the full scene. At that point, when running the simulation and interacting with objects using the mouse, the crash would occur at a specific moment during simulation (the exact time varies depending on the scene, but is consistent for the same scene). This also happens when the Python control script is loaded. Because of this, I suspect the crash is related to modifying joint state position and velocity directly or through the script. My script writes the desired velocity every frame inside on_update() using jointStateAPI.CreateVelocityAttr() after some conditional checks.

Since this issue appears only in scenes with a more complex articulation structure, I also suspect it may be related to how I constructed the articulation hierarchy. Each assembly module contains a fixed_Link as the articulation root. Between links, I use separate cylinder or slider components as joints (and I also define articulation roots inside these components). I treat these articulation joints as floating joints (so that each one can move along with dynamic links and can also stand alone as a single component) and connect them to the previous and next link using fixed joints. I realize this structure is quite complicated and may not fully follow articulation best practices. My intention was to create modular, interchangeable joint units between links.

For each of those joint components, I added a prismatic joint, a joint drive, and a joint state (for reading and writing). The parameters are as follows:

  • Maximum Joint Velocity (physxJoint:maxJointVelocity):
    1000000.0 distance/s (linear joint: mm/s).
    This might explain the velocity limitation I observed in CPU physics, but I’m still unsure why the actual joint state velocity is limited at around ±100 (same distance/second units, and my scene units are also mm).

  • Joint Drive settings:

    • Max Force (drive:linear:physics:maxForce): Not Limited

    • Max Actuator Velocity (physx:DrivePerformanceEnvelope:linear:maxActuatorVelocity): inf

    • Speed‑Effort‑Gradient: 0.0

    • Velocity Dependent Resistance: 0.0
      All of these are default values and are not modified by my script.

In my later work, I switched to CPU physics and used the joint drive’s target velocity combined with a large damping value to achieve the speeds I needed.

I will need some time to gradually break down my scene and scripts to identify the exact cause. It’s worth mentioning that I reproduced the same issue on another Windows laptop running USD Composer (different GPU, a 3050 Ti, and a different driver), and I also reproduced it in a freshly built Basic-Editor using the same physics extensions. This makes me think the issue is not related to hardware or firmware.

OK, just so you know, a GeForce RTX 3070 with 8 GB of VRAM barely meets our minimum specs to run anything Omniverse or Kit. You may literally be running out of memory as well. You really need a minimum of 12 GB, ideally 16 GB.
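
To check whether memory is actually running out, VRAM usage can be sampled from outside Kit with `nvidia-smi` while the scene runs. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names and the 256 MiB growth threshold are arbitrary choices for illustration.

```python
import subprocess

# Standard nvidia-smi query: used and total memory in MiB, one CSV row per GPU.
QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_memory_csv(text):
    """Parse 'used, total' rows into a list of (used_mib, total_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_gpu_memory():
    """Run nvidia-smi once and return the parsed rows (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


def is_growing(used_samples, min_increase_mib=256):
    """True if usage never drops across samples and the net rise meets the threshold."""
    if len(used_samples) < 2:
        return False
    monotonic = all(b >= a for a, b in zip(used_samples, used_samples[1:]))
    return monotonic and used_samples[-1] - used_samples[0] >= min_increase_mib


# Illustrative output for a single 8 GB GPU; real values come from read_gpu_memory().
print(parse_memory_csv("6500, 8192\n"))
```

Calling `read_gpu_memory()` every few seconds and passing the collected "used" values to `is_growing` gives a rough signal for a steady leak as opposed to a one-time spike.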

Thank you for the suggestions. At the moment, I’m not able to upgrade to a higher‑end GPU or larger VRAM. For my current project, the hardware I’m using generally meets the requirements, and I haven’t encountered any major issues in previous work. So for now, I will continue using this configuration.

I also have some new findings regarding the velocity limitation in CPU mode. I reviewed all articulation‑related parameters, the PhysicsScene settings, and tested multiple assembly variations without using any scripts. Based on these tests, the velocity cap does not appear to be related to:

  • Joint drive parameters:
    PhysxArticulationDriveAPI.maxActuatorVelocity,
    speedEffortGradient,
    velocityDependentResistance,
    maxForce

  • PhysicsScene settings:
    SolverType,
    Velocity Iteration Count,
    Time Steps,
    physxScene:solveArticulationContactLast

The only parameter that seems to affect the velocity limit is:

PhysxJointAPI:maxJointVelocity

Here is the interesting part:

  • If I change maxJointVelocity during simulation, even to a very small value like 0.1, then I am able to set and read much higher velocities through JointStateAPI.

  • However, if I modify maxJointVelocity before starting the simulation, the ±100 velocity limit remains unchanged.

This behavior is consistent across different assemblies and scenes.

I will continue investigating, but this seems to be a key factor in the velocity limitation I observed.

Ok let me refer this to the engineers. Thank you for your findings. Hopefully you have found a way to work with it.

I accidentally enabled the “My USD Viewer Setup Extension”, and as a result the toolbar and menu bar in USD Composer disappeared. This left me unable to perform any system operations, open the Extension window, or access any other UI elements.

I attempted to rebuild a fresh USD Composer application from the Kit App Template, but the issue persisted. This forced me to re‑clone the GitHub repository (version 110.1.0). Due to version differences, I found that the latest Kit App Template version of USD Composer can no longer enable some of the physics and other extensions I rely on through the Extension Registry. Because of this, I switched to using Basic Editor to complete my work.

During use, I encountered RTX rendering crashes again. The error looks similar to what I’ve seen before, but this time the cause appears to be an incorrect looping process that gradually consumed all GPU memory. What’s more concerning is that the crash occurred even when I wasn’t performing any actions.

{
    debugName="CopyDepth output",
}
2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.rtx] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.rtx] vkAllocateMemory failed.
2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.rtx] Texture creation failed for the device: 0.
2026-04-29T10:30:10Z [4,411,483ms] [Warning] [rtx.hydra] Rendering failed.
2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.usd] HydraEngine::render failed to end the compute graph: error code 6
2026-04-29T10:30:47Z [4,447,821ms] [Error] [omni.rtx] Out of GPU memory allocating resource 'CopyDepth output' [size: unknown]
2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.rtx] Failure injector rule to repro:
{
    debugName="CopyDepth output",
}
2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.rtx] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.rtx] vkAllocateMemory failed.
2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.rtx] Texture creation failed for the device: 0.
2026-04-29T10:30:47Z [4,447,822ms] [Warning] [rtx.hydra] Rendering failed.
2026-04-29T10:30:47Z [4,447,822ms] [Warning] [omni.usd] HydraEngine::render failure: expected 1 results, but have 0

2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.usd] HydraEngine::render failed to end the compute graph: error code 6
2026-04-29T10:30:48Z [4,448,737ms] [Error] [omni.rtx] Out of GPU memory allocating resource 'CopyDepth output' [size: unknown]
2026-04-29T10:30:48Z [4,448,737ms] [Error] [omni.rtx] Failure injector rule to repro:
{
    debugName="CopyDepth output",
}
2026-04-29T10:30:48Z [4,448,738ms] [Error] [omni.rtx] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2026-04-29T10:30:48Z [4,448,738ms] [Error] [omni.rtx] vkAllocateMemory failed.
2026-04-29T10:30:48Z [4,448,738ms] [Error] [omni.rtx] Texture creation failed for the device: 0.
2026-04-29T10:30:48Z [4,448,738ms] [Warning] [rtx.hydra] Rendering failed.
2026-04-29T10:30:48Z [4,448,738ms] [Error] [omni.usd] HydraEngine::render failed to end the compute graph: error code 6
2026-04-29T10:30:48Z [4,449,641ms] [Error] [omni.rtx] Out of GPU memory allocating resource 'CopyDepth output' [size: unknown]
2026-04-29T10:30:48Z [4,449,641ms] [Error] [omni.rtx] Failure injector rule to repro:
{
    debugName="CopyDepth output",
}

I’ve attached the log information. From the logs, you can see that the errors started very early and kept repeating until the GPU memory was fully exhausted, eventually causing the crash.
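
For anyone reading the attached logs, a short script can show when each distinct error first appeared and how often it repeated. This is a sketch that assumes the line layout seen in the excerpts above (timestamp, elapsed milliseconds, level, channel, message); the function name is made up for this example, and lines that do not match the pattern, such as the multi-line `{ debugName=... }` blocks, are simply skipped.

```python
import re
from collections import OrderedDict

# Matches lines such as:
# 2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.rtx] vkAllocateMemory failed.
LOG_LINE = re.compile(
    r"^(?P<stamp>\S+) \[(?P<ms>[\d,]+)ms\] \[(?P<level>\w+)\] "
    r"\[(?P<channel>[^\]]+)\] (?P<msg>.*)$"
)


def summarize_errors(lines, levels=("Error",)):
    """Return {(channel, message): (first_ms, count)} in order of first appearance."""
    summary = OrderedDict()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match or match["level"] not in levels:
            continue
        key = (match["channel"], match["msg"])
        ms = int(match["ms"].replace(",", ""))
        first_ms, count = summary.get(key, (ms, 0))
        summary[key] = (first_ms, count + 1)
    return summary


sample = """\
2026-04-29T10:30:10Z [4,411,483ms] [Error] [omni.rtx] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2026-04-29T10:30:10Z [4,411,483ms] [Warning] [rtx.hydra] Rendering failed.
2026-04-29T10:30:47Z [4,447,822ms] [Error] [omni.rtx] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
""".splitlines()

for (channel, msg), (first_ms, count) in summarize_errors(sample).items():
    print(f"{first_ms:>10} ms  x{count}  [{channel}] {msg}")
```

To run it on a full log, pass `open(path, encoding="utf-8", errors="replace")` instead of `sample`.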

In addition, I noticed that the new rendering mode RTX‑Minimal (the one besides Path Tracing and Real‑Time 2.0) is also unusable. Attempting to switch to it causes the entire application to crash immediately.

kit_20260429_111639.log (1.4 MB)

kit_20260429_124804.log (1.1 MB)

Hi there. If you are saying that there is a bug with the very latest version of USD Composer, using kit 110.1, then please completely nuke the whole kit install and start over. The best way to do this is to first delete the whole “kit app template” repo folder. Then go into /users/USER/appdata/local and delete the whole OV folder.

Then start over again. If the bug persists, then you can always clone from an older repo by downgrading the repo choice back to 110.0 or even 109, the same version you were on. Please try these steps and let me know.

Hello, I completely removed the entire kit app template folder and its shared ov folder, and cleared all associated cache files. After rebuilding the kit app template repositories for both versions 109 and 110, everything is now working without any issues.

Great to hear !!