EGLStream(CUDA) -> cv::cuda::GpuMat using Argus & nppi

I’m trying to retrieve a cv::cuda::GpuMat from a CUDA EGLStream.

My code is based on jetson_multimedia_api/argus/samples/syncSensor.

The current stream settings are:

    iEGLStreamSettings->setPixelFormat(PIXEL_FMT_YCbCr_420_888);
    iEGLStreamSettings->setResolution(STREAM_SIZE);
    iEGLStreamSettings->setEGLDisplay(g_display.get());

I’ve made some modifications to ScopedCudaEGLStreamFrameAcquire::generateHistogram:

    bool ScopedCudaEGLStreamFrameAcquire::generateHistogram(unsigned int histogramData[HISTOGRAM_BINS],
                                                            float *time)
    {
        if (!hasValidFrame() || !histogramData || !time)
            ORIGINATE_ERROR("Invalid state or output parameters");

        unsigned int height = m_frame.height;
        unsigned int width = m_frame.width;

        NppiSize in_size{
            .width = static_cast<int>(width),
            .height = static_cast<int>(height),
        };

        cv::cuda::GpuMat gpuMat;
        gpuMat.create(cv::Size(width, height), CV_8UC4);

        NppStatus status = nppiNV21ToBGR_8u_P2C4R((const Npp8u * const*)m_frame.frame.pPitch[0],
                                                  m_frame.pitch,
                                                  (Npp8u *)gpuMat.cudaPtr(),
                                                  gpuMat.step,
                                                  in_size);

        return true;
    }

The function returns successfully, but the next call to it throws at gpuMat.create():

    terminate called after throwing an instance of 'cv::Exception'
      what():  OpenCV(4.6.0) /home/nano1/opencv/modules/core/src/cuda/gpu_mat.cu:116: error: (-217:Gpu API call) unspecified launch failure in function 'allocate'

I haven’t called any OpenCV function outside generateHistogram().
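
One thing that may be worth checking in the call above: as far as I can tell, nppiNV21ToBGR_8u_P2C4R expects its first argument to be a host array of two device plane pointers (Y and interleaved chroma), whereas the cast passes the Y-plane pointer itself as if it were that array. Also, CUeglFrame keeps pArray and pPitch in a union, so pPitch is only meaningful when frameType is CU_EGL_FRAME_TYPE_PITCH. Below is a minimal sketch of such a guard. FrameView, FrameType and semiPlanarPointers are hypothetical stand-ins so that it compiles without the CUDA headers, and whether this is the actual cause of the failure here is an assumption.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the parts of CUeglFrame used here. In real code
// these would be CUeglFrame, CU_EGL_FRAME_TYPE_ARRAY and
// CU_EGL_FRAME_TYPE_PITCH from cudaEGL.h.
enum class FrameType { Array, Pitch };

struct FrameView {
    void*        planes[3];   // stands in for frame.pPitch[] (a union with pArray[])
    unsigned int planeCount;
    FrameType    frameType;
};

// Returns {Y plane, interleaved chroma plane} only when the frame is
// pitch-linear and has at least two planes; otherwise returns nothing, so
// the caller never hands a CUarray handle to a function expecting raw memory.
std::optional<std::array<const std::uint8_t*, 2>>
semiPlanarPointers(const FrameView& f)
{
    if (f.frameType != FrameType::Pitch || f.planeCount < 2)
        return std::nullopt;

    return std::array<const std::uint8_t*, 2>{
        static_cast<const std::uint8_t*>(f.planes[0]),
        static_cast<const std::uint8_t*>(f.planes[1])};
}
```

In the real function, the data() of the returned array would be what goes in as the source argument, assuming both planes share m_frame.pitch. If the frame turns out to be array-typed, the surface-object path that the original syncSensor histogram code uses is probably the safer route.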

Any help will be appreciated!

I’ve just noticed that some OpenCV code was still left in.

After removing it, the code fails on the second execution with NPP_CUDA_KERNEL_EXECUTION_ERROR:

    ...

    r = cuEGLStreamConsumerAcquireFrame(&m_connection, &m_resource, &m_stream, -1);
    if (r == CUDA_SUCCESS)
    {
      printf("Frame acquired successfully!\n");
      r = cuGraphicsResourceGetMappedEglFrame(&m_frame, m_resource, 0, 0);

      if (r != CUDA_SUCCESS)
      {
        const char* errmsg;

        cuGetErrorString(r, &errmsg);

        printf("cuGraphicsResourceGetMappedEglFrame failed\n");
        printf("%s\n", errmsg);
      }
    }

A small update on my progress so far.

The unspecified launch failure is returned by cuGraphicsResourceGetMappedEglFrame().

Its return value is 719, which is CUDA_ERROR_LAUNCH_FAILED.
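
Since kernel faults are reported asynchronously, and after CUDA_ERROR_LAUNCH_FAILED later CUDA calls in the same process keep returning the same error, the call that reports 719 is not necessarily the one that caused it. A small sketch of a tracker that remembers only the first failing step can help narrow this down. FirstFailure is a hypothetical helper, not part of CUDA or the Argus samples, and it assumes each step is reduced to an integer code where 0 means success.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: records the first step that returned a nonzero code
// and ignores later failures, which may just be echoes of the first one.
class FirstFailure {
public:
    // Returns true when the step succeeded (code == 0).
    bool record(std::string step, int code)
    {
        if (code != 0 && !failed_) {
            failed_ = true;
            step_ = std::move(step);
            code_ = code;
        }
        return code == 0;
    }

    bool failed() const { return failed_; }
    const std::string& step() const { return step_; }
    int code() const { return code_; }

private:
    bool failed_ = false;
    std::string step_;
    int code_ = 0;
};
```

Recording every driver, runtime and NPP call in order, with a stream synchronize recorded right after the conversion, should show the earliest point at which the error surfaces; the faulty kernel is then likely the last one launched before that point.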

TL;DR

When nppiNV21ToBGR_8u_P2C4R_Ctx() is called to convert a CUeglFrame into cv::cuda::GpuMat (BGR),
cuEGLStreamConsumerReleaseFrame() fails.

I then tried gpuMat.download(cpuMat) followed by cv::imwrite(cpuMat); this throws an unspecified launch failure as well.

========LOOP STARTED========
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] cudaStreamCreate
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] cuEGLStreamConsumerAcquireFrame!
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] cuGraphicsResourceGetMappedEglFrame 
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] m_connection: 0x7f60463b90
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] m_resource: 0x7f60479420
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [GOOD] m_stream: 0x7f60463080
ScopedCudaEGLStreamFrameAcquire::generateHistogram | [GOOD] nppSetStream
ScopedCudaEGLStreamFrameAcquire::generateHistogram | [GOOD] nppGetStreamContext
ScopedCudaEGLStreamFrameAcquire::generateHistogram | [GOOD] cudaStreamSynchronize
========LOOP FINISHED========
ScopedCudaEGLStreamFrameAcquire::~ScopedCudaEGLStreamFrameAcquire | [BAD] cuEGLStreamConsumerReleaseFrame! <- Crashes
ScopedCudaEGLStreamFrameAcquire::~ScopedCudaEGLStreamFrameAcquire | [BAD] m_connection: 0x7f60463b90
ScopedCudaEGLStreamFrameAcquire::~ScopedCudaEGLStreamFrameAcquire | [BAD] m_resource: 0x7f60479420
ScopedCudaEGLStreamFrameAcquire::~ScopedCudaEGLStreamFrameAcquire | [BAD] m_stream: 0x7f60463080
ScopedCudaEGLStreamFrameAcquire::~ScopedCudaEGLStreamFrameAcquire | [BAD] cuEGLStreamConsumerReleaseFrame unspecified launch failure!
========LOOP STARTED========
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [BAD] cudaStreamCreate
ScopedCudaEGLStreamFrameAcquire::ScopedCudaEGLStreamFrameAcquire | [BAD] cuGraphicsResourceGetMappedEglFrame!
CONSUMER: No more frames. Cleaning up.
CONSUMER: Done.
PRODUCER: Captures complete, disconnecting producer.
PRODUCER: Done -- exiting.

I’ve attached my code for anyone interested.

As always, I appreciate your help!

syncSensor.zip (11.4 KB)