Possible driver bug with CUDA/EGLStreams interop


I’m trying to get some code that uses CUDA and EGLStreams working properly, but I’ve come across an odd issue when I start returning/releasing frames from the EGL side.

10392.347206711 [26646-26663] freeSurface       Releasing EGLImage: 0x7f8bb1099531 (cuArray[0] = 0x7f8bb11fb3a0, cuArray[1] = 0x7f8bb11fb4c0)

Here we call eglStreamReleaseImageNV on the EGLImage to return it to CUDA, and then call eglDestroyImage on that same image.

~40 milliseconds later we need to allocate a new frame and we start by emptying the EGLStream from the CUDA side, by calling cuEGLStreamProducerReturnFrame until it fails.

One of the returned frames matches the one above, at which point we destroy the arrays that make up that image via cuArrayDestroy.

10392.386737477 [26646-26712] allocateSurface   got returned frame
10392.386739367 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb11fb700
10392.386801017 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb11fba60
10392.386811417 [26646-26712] allocateSurface   got returned frame
10392.386814337 [26646-26712] allocateSurface   Cleaning up CUDA array *0x7f8bb11fb3a0*
10392.386823448 [26646-26712] allocateSurface   Cleaning up CUDA array *0x7f8bb11fb4c0*
10392.386832498 [26646-26712] allocateSurface   got returned frame
10392.386834348 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb0f8f280
10392.386840698 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb0f8f3a0
10392.386846718 [26646-26712] allocateSurface   got returned frame
10392.386849538 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb0edca60
10392.386856128 [26646-26712] allocateSurface   Cleaning up CUDA array 0x7f8bb0edc820

So at this point, there should be no remaining references to the EGLImage or the cuArrays.

Then 8 seconds later we need to allocate another frame, and this is where things go wrong:

10400.503672104 [26646-26731] allocateSurface   Presenting frame 6 352x224 (cuArray[0] = 0x7f8bb11fb3a0, cuArray[1] = 0x7f8bb11fb4c0)
10400.503683674 [26646-26731] allocateSurface   Acquired image from EGLStream: 0x7f8bb1099531
10400.503691014 [26646-26731] debug             [EGL] eglExportDMABUFImageQueryMESA: EGL_BAD_PARAMETER error: In eglExportDMABUFImageQueryMESA: Invalid EGLImage (0x7f8bb1099531)

We create two new cuArrays and, by some coincidence (or more likely CUDA internal caching), they come back at the same addresses as two of the arrays we destroyed earlier (0x7f8bb11fb3a0 and 0x7f8bb11fb4c0). However, when we present the frame (with cuEGLStreamProducerPresentFrame), we don’t see the expected sequence of events on the EGL side. We expect to see an EGL_STREAM_IMAGE_ADD_NV event and then an EGL_STREAM_IMAGE_AVAILABLE_NV event (since we’ve just created a new pair of arrays, a new EGLImage should be allocated for them).

What we see instead is just an EGL_STREAM_IMAGE_AVAILABLE_NV event, carrying the EGLImage that was previously attached to those buffers (there should be a line before the ‘Acquired…’ one saying an image was added). The eglExportDMABUFImageQueryMESA call that follows fails outright, reporting that the EGLImage we just received isn’t valid.
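To make the suspected failure mode concrete, here is a toy model in plain C. It is only a sketch of my guess at what happens, not the driver's actual implementation: every type and function in it (ToyStream, toy_present, toy_destroy_image, toy_remove_buffer, toy_image_is_valid) is hypothetical and none of them are real EGL or CUDA calls. The model assumes the stream keeps a buffer-to-image mapping keyed on the buffer address, and that destroying the image on the consumer side does not drop that mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: nothing here is a real EGL or CUDA API. */
#define MAX_SLOTS 8

typedef struct {
    uintptr_t buffer_key;   /* stands in for the cuArray address        */
    uint32_t  image_handle; /* stands in for the EGLImage handle        */
    int       mapped;       /* stream still remembers buffer -> image   */
    int       image_alive;  /* consumer has not destroyed the image yet */
} Slot;

typedef struct {
    Slot     slots[MAX_SLOTS];
    uint32_t next_handle;
} ToyStream;

typedef enum { EV_NONE, EV_IMAGE_ADD, EV_IMAGE_AVAILABLE } ToyEvent;

static Slot *find_mapped(ToyStream *s, uintptr_t key) {
    for (size_t i = 0; i < MAX_SLOTS; i++)
        if (s->slots[i].mapped && s->slots[i].buffer_key == key)
            return &s->slots[i];
    return NULL;
}

/* Present a buffer; returns the first event the consumer would see. */
ToyEvent toy_present(ToyStream *s, uintptr_t key, uint32_t *handle_out) {
    assert(s != NULL && handle_out != NULL);
    Slot *slot = find_mapped(s, key);
    if (slot) {                      /* known address: no ADD event */
        *handle_out = slot->image_handle;
        return EV_IMAGE_AVAILABLE;
    }
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        if (!s->slots[i].mapped) {   /* unknown address: new image */
            s->slots[i].buffer_key   = key;
            s->slots[i].image_handle = ++s->next_handle;
            s->slots[i].mapped       = 1;
            s->slots[i].image_alive  = 1;
            *handle_out = s->slots[i].image_handle;
            return EV_IMAGE_ADD;
        }
    }
    return EV_NONE;                  /* table full */
}

/* Consumer destroys the image, but the mapping is left in place. */
void toy_destroy_image(ToyStream *s, uint32_t handle) {
    for (size_t i = 0; i < MAX_SLOTS; i++)
        if (s->slots[i].mapped && s->slots[i].image_handle == handle)
            s->slots[i].image_alive = 0;
}

/* What a "remove" step would have to do: forget the mapping itself. */
void toy_remove_buffer(ToyStream *s, uintptr_t key) {
    Slot *slot = find_mapped(s, key);
    if (slot)
        slot->mapped = 0;
}

int toy_image_is_valid(const ToyStream *s, uint32_t handle) {
    for (size_t i = 0; i < MAX_SLOTS; i++)
        if (s->slots[i].mapped && s->slots[i].image_handle == handle)
            return s->slots[i].image_alive;
    return 0;
}
```

In this model, presenting a buffer, destroying its image, and then presenting a new buffer that happens to land at the same address yields only EV_IMAGE_AVAILABLE with the old, now invalid, handle. That matches the log above. If toy_remove_buffer is called in between, the second present yields EV_IMAGE_ADD with a fresh handle, which is the behaviour I was expecting from the real stream.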

I’m guessing the problem is that I’m not removing the mapping/image from EGL properly. However, I can’t find any documentation on how to trigger the EGL_STREAM_IMAGE_REMOVE_NV event, and there are no obvious functions that would do it either. Is there a way to trigger this event?

Thanks & Regards

Anyone? It seems like a pretty big oversight not to be able to remove images from the stream, esp. when the EGL API allows for it.

You might get better help posting on one of the Jetson forums. If you do, you may be asked for a short reproducer code.

eglExportDMABUFImageMESA() would be the next logical step in your program, but as far as I can tell this doesn’t work either, even though it is advertised in the extensions list.