Hi @all500234765! Alexey Panteleev and I ran into a similar issue while working on an API interop. There were two hard-to-find pitfalls we hit that change how memory layouts work.
- Probably the most likely one you’re running into – in Vulkan, if the image wasn’t created with a dedicated allocation (i.e. using VK_KHR_dedicated_allocation’s `VkMemoryDedicatedAllocateInfoKHR` in the `VkMemoryAllocateInfo`’s `pNext` chain), then the CUDA external memory handle must not use the `cudaExternalMemoryDedicated` flag. I’d try removing the `cuExtmemHandleDesc.flags = cudaExternalMemoryDedicated;` line in the CUDA-Vulkan path and see if that fixes things (there’s a sketch of the matching pair after this list).
The artifacts from this one usually look like glitchy vertical stripes, which sort of match what’s going on here:
The underlying reason for this requirement: if the driver knows that an image uses a dedicated allocation, then it knows there’s only one image in that allocation and that the image is at offset 0, which allows it to apply different optimizations, including a different image layout in memory. (Thanks to Vivek Kini for this info.)
- (Including this one for completeness; the code sample above avoids it, but it might be useful to someone else who’s reading this, since I ran into it.) The two APIs must agree on the depth of the image – in particular, one must be careful to use a depth of 0 (instead of 1) for a 2D CUDA image. If it’s 1, then that describes a 3D width x height x 1 texture, which may use a different layout (and will produce incorrect results if accessed using `surf2D()`). The second sketch after this list shows this.
The artifacts for this one usually have some “holes” in a periodic pattern at some resolutions:
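Here’s a rough sketch of what the matching pair for the first point can look like. This is not drop-in code – `device`, `image`, `memory`, `fd`, `imageMemorySize`, and `memoryTypeIndex` are placeholders from your existing setup, and I’m assuming the opaque-FD handle type on Linux – but it shows the invariant: the CUDA import uses `cudaExternalMemoryDedicated` if and only if the Vulkan allocation carried `VkMemoryDedicatedAllocateInfoKHR` in its `pNext` chain:

```cpp
#include <vulkan/vulkan.h>
#include <cuda_runtime.h>

// Placeholders from your existing setup:
// VkDevice device; VkImage image; VkDeviceMemory memory;
// uint32_t memoryTypeIndex; VkDeviceSize imageMemorySize; int fd;

// Vulkan side: an exportable allocation for `image`. The dedicatedInfo link
// in the pNext chain is what makes this a dedicated allocation.
VkMemoryDedicatedAllocateInfoKHR dedicatedInfo{};
dedicatedInfo.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR;
dedicatedInfo.image = image;

VkExportMemoryAllocateInfoKHR exportInfo{};
exportInfo.sType       = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR;
exportInfo.pNext       = &dedicatedInfo; // drop this link for a non-dedicated allocation
exportInfo.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;

VkMemoryAllocateInfo allocInfo{};
allocInfo.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
allocInfo.pNext           = &exportInfo;
allocInfo.allocationSize  = imageMemorySize;
allocInfo.memoryTypeIndex = memoryTypeIndex;
vkAllocateMemory(device, &allocInfo, nullptr, &memory);

// CUDA side: the import must match the allocation above. `fd` is the file
// descriptor exported with vkGetMemoryFdKHR.
cudaExternalMemoryHandleDesc extMemDesc{};
extMemDesc.type      = cudaExternalMemoryHandleTypeOpaqueFd;
extMemDesc.handle.fd = fd;
extMemDesc.size      = imageMemorySize;
// Set this flag only because the allocation above is dedicated; if you remove
// VkMemoryDedicatedAllocateInfoKHR from the pNext chain, leave flags = 0.
extMemDesc.flags = cudaExternalMemoryDedicated;

cudaExternalMemory_t extMem = nullptr;
cudaImportExternalMemory(&extMem, &extMemDesc);
```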
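And a sketch for the second point – mapping the imported memory as a 2D mipmapped array so `surf2D()` can access it. Again, `width`, `height`, the single mip level, and the RGBA8 format are assumptions; I believe `cudaArraySurfaceLoadStore` is needed in the flags for surface access (same flags as `cudaMallocMipmappedArray`), but check the docs for your CUDA version. The key detail is the 0 in the extent’s depth:

```cpp
#include <cuda_runtime.h>

// Assumes `extMem` was imported as in the previous sketch, and that the Vulkan
// image is 2D, width x height, one mip level, VK_FORMAT_R8G8B8A8_UNORM.
cudaExternalMemoryMipmappedArrayDesc mipDesc{};
mipDesc.offset     = 0;                                  // dedicated allocation => image at offset 0
mipDesc.formatDesc = cudaCreateChannelDesc<uchar4>();    // 8-bit RGBA
mipDesc.extent     = make_cudaExtent(width, height, 0);  // depth 0, NOT 1, for a 2D image
mipDesc.numLevels  = 1;
mipDesc.flags      = cudaArraySurfaceLoadStore;          // allow surface load/store

cudaMipmappedArray_t mipArray = nullptr;
cudaExternalMemoryGetMappedMipmappedArray(&mipArray, extMem, &mipDesc);

// Grab level 0 and wrap it in a surface object so kernels can use surf2D().
cudaArray_t level0 = nullptr;
cudaGetMipmappedArrayLevel(&level0, mipArray, 0);

cudaResourceDesc resDesc{};
resDesc.resType         = cudaResourceTypeArray;
resDesc.res.array.array = level0;

cudaSurfaceObject_t surf = 0;
cudaCreateSurfaceObject(&surf, &resDesc);
```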
Hope this helps!