I’ve been able to view some USD models with USD Explorer/Isaac Sim on the Vision Pro by following the steps here: Spatial Streaming for Omniverse Digital Twins Workflows
Instead of using the Configurator app for Vision Pro, I built my own CloudXR client following the steps from GitHub: NVIDIA/cloudxr-framework (Swift frameworks for building client applications that connect to CloudXR servers).
I can connect to USD Explorer using the CloudXR Runtime (5.0) in VR mode, and it works great.
However, I don’t see a way to move in VR even with hand tracking enabled. In contrast, when I use the visionOS simulator (not the AVP itself), I can move forwards/backwards by doing a pinch gesture with my trackpad.
This means at least some events are passed to the Omniverse server, but I can’t see a method in CloudXR to pass events like that. I’ve been told it should be possible to navigate using the two-handed pinch gesture from visionOS.
Does anyone have any experience with that?
You’re right: in the visionOS simulator, a trackpad pinch moves the camera in USD Explorer or Isaac Sim, so navigation with spatial streaming appears to “just work” there. On real Vision Pro hardware connected through a CloudXR client, navigation is quite limited unless the client translates gestures into input events the server understands.
Why You Can’t Move in CloudXR with Hand Tracking (Current State)
- No Direct Event Mapping: The CloudXR framework for Vision Pro does not currently forward or emulate the two-handed pinch navigation gesture as movement controls to the Omniverse server when running on real hardware. This is different from the simulator, which translates trackpad and keyboard inputs into navigation commands recognized by the server.
- CloudXR Client API Gaps: As of CloudXR 5.0 and the open-source NVIDIA/cloudxr-framework, there is no explicit API or event handler exposed for mapping custom visionOS gestures (like two-handed pinch) to navigation in Omniverse VR sessions. Instead, CloudXR clients mostly relay pose and input events defined in their base connection spec.
- Vision Pro/visionOS Hand Tracking: Vision Pro supports a range of spatial gestures (single and two-handed pinches for drag, zoom, etc.), but these must be explicitly captured and transmitted from the custom client to the server app as recognized input events (not yet default behavior with third-party CloudXR clients).
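To illustrate what “explicitly captured” could mean on the client side, here is a minimal sketch of pinch detection for one hand. It assumes your client already has thumb-tip and index-tip positions in meters from its hand-tracking data. `JointPosition`, `PinchDetector`, and the threshold values are hypothetical names and numbers chosen for this example, not part of visionOS or the CloudXR framework.

```swift
// Hypothetical type: a joint position in meters, filled in from whatever
// hand-tracking data your client already receives.
struct JointPosition {
    var x: Double
    var y: Double
    var z: Double
}

// Euclidean distance between two joints.
func distance(_ a: JointPosition, _ b: JointPosition) -> Double {
    let dx = a.x - b.x
    let dy = a.y - b.y
    let dz = a.z - b.z
    return (dx * dx + dy * dy + dz * dz).squareRoot()
}

// Tracks whether one hand is pinching. Two thresholds (hysteresis) keep the
// state from flickering when the fingertips hover near the boundary.
struct PinchDetector {
    // Assumed thresholds in meters; tune them against your own tracking data.
    var startThreshold = 0.015
    var endThreshold = 0.030
    var isPinching = false

    // Feed the latest fingertip positions; returns the updated pinch state.
    mutating func update(thumbTip: JointPosition, indexTip: JointPosition) -> Bool {
        let gap = distance(thumbTip, indexTip)
        if isPinching {
            // Only release once the fingers are clearly apart.
            if gap > endThreshold { isPinching = false }
        } else {
            // Only start once the fingers are clearly together.
            if gap < startThreshold { isPinching = true }
        }
        return isPinching
    }
}
```

You would keep one detector per hand and treat “both detectors report a pinch” as the start of a two-handed gesture. How that state is then sent to the server depends on what your CloudXR client exposes, which this sketch does not cover.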
Event Forwarding in Practice
- The visionOS simulator fakes navigation (e.g., forward/back on pinch) by mapping trackpad gestures to movement events. On a real AVP, those gestures must be handled at the client layer and mapped to a CloudXR controller or keyboard event that the server understands.
- Existing forums and documentation confirm others have encountered the same limitation, with no out-of-the-box fix available at this time.
Hints for Future/Custom Implementation
- If you want to enable such navigation, you’d need to extend your CloudXR client to:
  - Recognize the two-handed pinch (or other custom gesture) on Vision Pro via visionOS’s gesture recognizer (in Swift).
  - Forward it as a “move forward/back” event emulating a controller input (e.g., VR touchpad or thumbstick axis) to the CloudXR server.
- The Omniverse/Isaac Sim app should then interpret that event as camera movement.
- Some Omniverse and Isaac Sim setups use custom OpenXR or OpenXRDevice-based event routing for alternate input mapping, but that workflow is not included in the standard CloudXR Swift framework yet.
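The gesture-to-axis translation described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `HandPosition` and `TwoHandPinchMapper` are hypothetical names, the dead zone and full-scale distances are guesses, and the choice that “hands moving apart” means forward is arbitrary. It produces a thumbstick-style value in the range -1...1 and deliberately stops short of sending anything, because the CloudXR framework call for that is not something this thread has identified.

```swift
// Hypothetical type: the position of a pinching hand in meters.
struct HandPosition {
    var x: Double
    var y: Double
    var z: Double
}

// Euclidean distance between the two hands.
func distance(_ a: HandPosition, _ b: HandPosition) -> Double {
    let dx = a.x - b.x
    let dy = a.y - b.y
    let dz = a.z - b.z
    return (dx * dx + dy * dy + dz * dz).squareRoot()
}

// Converts a two-handed pinch into a thumbstick-like axis value in -1...1.
// Positive means the hands moved apart since the gesture began (assumed to
// mean "forward"); negative means they moved together.
struct TwoHandPinchMapper {
    // Assumed tuning values in meters.
    var deadZone = 0.02   // separation change ignored as noise
    var fullScale = 0.20  // separation change that gives full deflection
    private var startSeparation: Double? = nil

    // Pass the position of each hand while it is pinching, or nil when that
    // hand is not pinching. Returns the axis value for this frame.
    mutating func update(left: HandPosition?, right: HandPosition?) -> Double {
        guard let l = left, let r = right else {
            startSeparation = nil  // gesture ended, so reset the baseline
            return 0
        }
        let separation = distance(l, r)
        guard let start = startSeparation else {
            startSeparation = separation  // first frame of the gesture
            return 0
        }
        let delta = separation - start
        if abs(delta) < deadZone { return 0 }
        let sign = delta < 0 ? -1.0 : 1.0
        let magnitude = (abs(delta) - deadZone) / (fullScale - deadZone)
        return sign * min(magnitude, 1.0)
    }
}
```

In a client, you would call `update` once per hand-tracking frame and hand the result to whatever controller-input path your client uses, so that the server sees it as an ordinary thumbstick axis. Using the change in separation rather than the absolute distance means the user can start the gesture with their hands at any comfortable spacing.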
Summary:
Movement/pinch navigation is not currently mapped by default in the Vision Pro CloudXR framework. You’ll need to add gesture-handling Swift code to your client to emit standard navigation inputs (similar to what the simulator does with a keyboard or trackpad), or wait for an official release from NVIDIA that incorporates this event forwarding natively. Until then, moving in VR on AVP with CloudXR will remain limited unless you implement the gesture translation yourself.