Long shot... access mesh data in a different program, but already loaded on the GPU

I am quite sure the answer is “No way”, but I am still wondering if I could somehow accomplish this.

We run a Unity application that loads a bunch of meshes. On these meshes we run calculations with OptiX.

Right now this is done by walking through the Unity scene, grabbing each Mesh, converting the mesh data to a format that works for OptiX, and then sending it there.

So at this point the GPU has loaded the same meshes twice, once for display in our app and once for Optix.

Is there a way to somehow reuse the meshes we already loaded? Even if OptiX couldn’t access them directly, it might be faster to download them from the GPU and reuse them instead of getting them from Unity.

Define “in a different program”. You mean inside a different process?

Accessing other graphics APIs’ (OpenGL, Vulkan, DX) resources on the device directly in OptiX will always require CUDA interoperability.
This is possible inside the same process for a subset of resource types and data formats supported by CUDA graphics interop. (E.g. images and textures need to be 1-, 2- or 4-component formats, not compressed or exotic bit layouts.)

Always keep in mind that CUDA vector types have a specific alignment requirement!
For example, if you have tightly interleaved vertex data like struct { float3 v; float2 t; }; in your graphics API, you cannot simply map that to CUDA, because the float2 requires 8-byte alignment but sits at offset 12 => crash with a misaligned access error.
(You could reinterpret that as individual floats to get 4-byte alignment, which is then slower to load than an aligned float2.)
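
To illustrate, here is a minimal device-side sketch of that case, assuming a tightly packed 20-byte vertex layout (float3 position plus float2 texcoord) coming from the graphics API; the kernel name and output buffers are just placeholders:

```cpp
#include <cuda_runtime.h>

// Tightly packed 20-byte vertex from the graphics API: float3 position + float2 texcoord.
// Note: declaring struct { float3 v; float2 t; } in CUDA would NOT match this layout,
// because float2 is __align__(8), so the compiler pads the struct to 24 bytes.

// Safe approach: read the packed buffer as plain floats (4-byte alignment).
__global__ void readPackedVertices(const float* packed, float3* outPos, float2* outTex, int count)
{
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= count)
    return;

  const float* v = packed + i * 5; // 5 floats == 20 bytes per vertex

  outPos[i] = make_float3(v[0], v[1], v[2]);
  outTex[i] = make_float2(v[3], v[4]);

  // What NOT to do: *reinterpret_cast<const float2*>(v + 3)
  // That load sits at byte offset 20 * i + 12, which is only 8-byte aligned
  // for some vertices => misaligned address error on the float2 load.
}
```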

Doing CUDA data access across process boundaries requires Inter Process Communication (IPC).
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#interprocess-communication

According to the memory management chapter “IPC functionality is restricted to devices with support for unified addressing on Linux and Windows operating systems. IPC functionality on Windows is restricted to GPUs in TCC mode”.
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM
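
To give an idea what that flow looks like with the runtime API (I haven’t used this myself; the function names below are the documented CUDA IPC entry points, everything else is just a sketch, and how the handle travels between the two processes is up to the application):

```cpp
#include <cuda_runtime.h>
#include <cstdio>

// Producer process: allocate device memory with cudaMalloc and export an IPC handle.
void exportBuffer(size_t bytes, cudaIpcMemHandle_t* handleOut, void** devPtrOut)
{
  cudaMalloc(devPtrOut, bytes);               // must be a real device allocation
  cudaIpcGetMemHandle(handleOut, *devPtrOut); // 64-byte handle, send it over any IPC channel
}

// Consumer process: open the received handle to get a device pointer to the same allocation.
void* importBuffer(cudaIpcMemHandle_t handle)
{
  void* devPtr = nullptr;
  const cudaError_t err = cudaIpcOpenMemHandle(&devPtr, handle, cudaIpcMemLazyEnablePeerAccess);
  if (err != cudaSuccess)
  {
    std::printf("cudaIpcOpenMemHandle failed: %s\n", cudaGetErrorString(err));
    return nullptr;
  }
  return devPtr; // usable in kernels and OptiX launches inside this process
}

// Cleanup: cudaIpcCloseMemHandle(devPtr) in the consumer, cudaFree(devPtr) in the producer.
```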

I would also assume the IPC memory handles require actual CUDA memory allocations and not just temporarily mapped virtual pointers from CUDA-graphics interop resources.
(I have no experience with CUDA IPC. I’m a Windows graphics guy.)

In summary, CUDA interop with graphics APIs in the same process shouldn’t be a problem, but I don’t expect that to work easily across processes via IPC, and not at all on Windows devices that are running graphics.

Thanks!

I was thinking that maybe from Unity I could get a handle on things via CUDA, or at least in the DLL (which uses OptiX) that I am importing, which should run inside the same process. Unity is using DirectX 11, if that makes a difference.

If you’re loading a DLL into the same process, then you should be able to use CUDA interop on buffer and image resources, but as said, be very careful with the CUDA alignment restrictions.

Once you have a CUDA resource handle, you can access the underlying device data.
I’m showing that for OpenGL in my OptiX 7 examples. Search for m_interop in this file for example:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo3/src/DeviceSingleGPU.cpp

There was a very similar discussion recently which effectively does the same with D3D11. It boils down to including another CUDA header and using the respective D3D11 variants of the resource registering calls; the rest behaves identically:
https://forums.developer.nvidia.com/t/unity3d-rendertexture-texture2d-to-optiximage2d/156408
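
To give a rough idea how that looks inside such a plugin DLL, here is a minimal sketch of the D3D11 registration path (function names are from cuda_d3d11_interop.h; getting the native ID3D11Buffer pointer out of Unity, e.g. via Mesh.GetNativeVertexBufferPtr on the C# side, is an assumption about your setup):

```cpp
#include <cuda_runtime.h>
#include <cuda_d3d11_interop.h>
#include <d3d11.h>

// Register once, e.g. when the plugin receives the native buffer pointer from Unity.
cudaGraphicsResource_t registerVertexBuffer(ID3D11Buffer* d3dBuffer)
{
  cudaGraphicsResource_t resource = nullptr;
  cudaGraphicsD3D11RegisterResource(&resource, d3dBuffer, cudaGraphicsRegisterFlagsNone);
  return resource;
}

// Map per use: gives a temporary device pointer to the vertex data.
// The pointer is only valid between map and unmap, and the layout/stride is
// whatever Unity put into the buffer, so mind the alignment caveats above.
void useVertexBuffer(cudaGraphicsResource_t resource, cudaStream_t stream)
{
  cudaGraphicsMapResources(1, &resource, stream);

  void*  devPtr      = nullptr;
  size_t sizeInBytes = 0;
  cudaGraphicsResourceGetMappedPointer(&devPtr, &sizeInBytes, resource);

  // devPtr can now be read by CUDA kernels or used to build OptiX
  // acceleration structure inputs (copy or reformat the data if needed).

  cudaGraphicsUnmapResources(1, &resource, stream);
}

// When the mesh goes away: cudaGraphicsUnregisterResource(resource);
```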

Awesome, thanks a lot! I’ll have a look