Accessing GPU buffers in a DeepStream filter on Jetson Nano

Hi! I’m trying to write a transform-in-place DeepStream filter that processes NV12 frames on the Jetson Nano with DeepStream 5.0.
I would like to access the NV12 frame from within CUDA kernels.

On dGPU this was simple, but on the Jetson I’m having trouble accessing these buffers. I understand I should go through the CUeglFrame structs.

I’ve made the cuGraphicsEGLRegisterImage and cuGraphicsResourceGetMappedEglFrame calls, and they seem to give sensible results.
However, whenever I then pass a GPU pointer from the CUeglFrame (e.g. egl_frame.frame.pPitch[0]) to a CUDA kernel and access it, I get “CUDA error: unspecified launch failure”.

Apart from the commented-out sections in the sample plugins, I cannot find a working example of how to access these buffers. Am I missing a call that should happen before accessing the buffers?

Any help would be appreciated!

Jetson Nano 4GB
JetPack 4.5
DeepStream 5.0
CUDA 10.2
TensorRT 7.1.3
NVIDIA driver 32.4.4

Small update for people with a similar issue:
After using the information and code in this post, I managed to map the NV12 data to a CUsurfObject that I can read in a CUDA kernel.
The next question is whether there is a way to directly modify the NV12 data in a GStreamer buffer as well.

Please refer to the open-source nvdspreprocess plugin. It uses CUeglFrame and NvBufSurfTransformAsync to do hardware-accelerated format conversion.

Thank you for the reply!
The NvBufSurfTransformAsync() function only does basic operations such as cropping, scaling, or format conversion, and it is closed source. Is there any way to modify the actual pixels of an NvBufSurface directly in a CUDA kernel?

Please refer to this topic: “Use NvBufSurface together with a custom cuda kernel”.

Thanks again for the reply.
The topic you referenced works for EGL frames of type CU_EGL_FRAME_TYPE_PITCH.
However, in my case on the Jetson Nano, I’m receiving frames of type CU_EGL_FRAME_TYPE_ARRAY.
Simply casting the CUeglFrame.frame.pPitch[0] or CUeglFrame.frame.pArray[0] value to a uint8_t * pointer and dereferencing it in a kernel results in an “unspecified launch failure”.
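
For anyone hitting the same failure, checking the frame type before touching the plane handles might look roughly like the sketch below. It assumes a CUDA context is already current and that an EGLImageKHR has already been obtained for the buffer (for example via NvBufSurfaceMapEglImage). The helper name map_and_inspect is hypothetical and not part of any SDK.

```cuda
#include <cuda.h>
#include <cudaEGL.h>
#include <cstdio>

// Hypothetical helper: registers an EGLImage with CUDA, maps it to a
// CUeglFrame, and reports how the planes are stored.
// Assumes a CUDA context is current on the calling thread.
static CUresult map_and_inspect(EGLImageKHR image,
                                CUgraphicsResource *resource,
                                CUeglFrame *frame)
{
    CUresult rc = cuGraphicsEGLRegisterImage(resource, image,
                                             CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
    if (rc != CUDA_SUCCESS)
        return rc;

    rc = cuGraphicsResourceGetMappedEglFrame(frame, *resource, 0, 0);
    if (rc != CUDA_SUCCESS) {
        cuGraphicsUnregisterResource(*resource);
        return rc;
    }

    if (frame->frameType == CU_EGL_FRAME_TYPE_PITCH) {
        // frame->frame.pPitch[i] are linear device pointers that a kernel
        // may dereference, using frame->pitch as the row stride in bytes.
        std::printf("pitch-linear frame: %u plane(s), pitch %u bytes\n",
                    frame->planeCount, frame->pitch);
    } else {
        // CU_EGL_FRAME_TYPE_ARRAY: frame->frame.pArray[i] are opaque CUarray
        // handles, not device addresses, so they must not be cast to a pointer.
        std::printf("array-backed frame: %u plane(s), %ux%u\n",
                    frame->planeCount, frame->width, frame->height);
    }
    return CUDA_SUCCESS;
}
```

The caller would still need to call cuGraphicsUnregisterResource on the resource once it has finished with the frame.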

In the meantime, though, I found that by wrapping the planes in a CUsurfObject (using cuSurfObjectCreate) you can access them with the surf2Dread and surf2Dwrite functions, so this solves my issue.
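
A minimal sketch of that surface-object approach, for reference. It assumes the Y plane arrives as a CUarray (frame.frame.pArray[0]) with one 8-bit channel per pixel, and that the driver-API context in use is the same one the runtime API launches kernels in. The kernel invert_luma and the helper invert_luma_plane are hypothetical names chosen for illustration, and inverting the luma is just a placeholder operation.

```cuda
#include <cuda.h>
#include <cuda_runtime.h>
#include <cstring>

// Illustrative kernel: inverts each 8-bit luma sample in place.
// The x coordinate of surf2Dread/surf2Dwrite is a byte offset; for
// 1-byte elements it equals the pixel index.
__global__ void invert_luma(cudaSurfaceObject_t y_plane, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height)
        return;

    unsigned char v = surf2Dread<unsigned char>(y_plane, x, y);
    surf2Dwrite<unsigned char>(static_cast<unsigned char>(255 - v), y_plane, x, y);
}

// Hypothetical helper: wraps a CUarray plane in a surface object, runs the
// kernel on it, and destroys the surface object again.
static CUresult invert_luma_plane(CUarray y_array, int width, int height,
                                  cudaStream_t stream)
{
    CUDA_RESOURCE_DESC desc;
    std::memset(&desc, 0, sizeof(desc));
    desc.resType = CU_RESOURCE_TYPE_ARRAY;
    desc.res.array.hArray = y_array;

    CUsurfObject surf = 0;
    CUresult rc = cuSurfObjectCreate(&surf, &desc);
    if (rc != CUDA_SUCCESS)
        return rc;

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    invert_luma<<<grid, block, 0, stream>>>(surf, width, height);

    // Catch launch-configuration errors first, then execution errors.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaStreamSynchronize(stream);

    cuSurfObjectDestroy(surf);
    return err == cudaSuccess ? CUDA_SUCCESS : CUDA_ERROR_UNKNOWN;
}
```

The interleaved UV plane (pArray[1]) could presumably be handled the same way with a two-channel element type such as uchar2, in which case the x byte offset becomes x * sizeof(uchar2). I have not verified that layout here, so check the frame's numChannels and plane dimensions first.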

Thanks for the help.

