Is it possible not to use cuGraphicsEGLRegisterImage for every frame?

I am using this code to map an NvBufSurface buffer to an OpenCV GpuMat:

    for (int i = 0; i < surface->numFilled; i++) {
        auto registerStart = std::chrono::high_resolution_clock::now();

        // Register the EGL image with CUDA (the expensive call)
        status = cuGraphicsEGLRegisterImage(&resources[i],
                surface->surfaceList[i].mappedAddr.eglImage,
                CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
        if (status != CUDA_SUCCESS) {
            GST_ERROR("cuGraphicsEGLRegisterImage failed: %d", status);
            break;
        }
        registered[i] = true;

        // Map the EGL image on every call
        status = cuGraphicsResourceGetMappedEglFrame(&frames[i], resources[i], 0, 0);
        if (status != CUDA_SUCCESS) {
            GST_ERROR("cuGraphicsResourceGetMappedEglFrame failed: %d", status);
            break;
        }

        auto registerEnd = std::chrono::high_resolution_clock::now();
        auto registerDuration = std::chrono::duration_cast<std::chrono::microseconds>(registerEnd - registerStart);
        GST_INFO("register + map took %ld microseconds", registerDuration.count());

        int stream_i = stream_idx.empty() ? i : stream_idx[i];
        assert(stream_i < surface->batchSize);
        mats[stream_i] = cv::cuda::GpuMat(frames[i].height, frames[i].width, CV_8UC4,
                                          frames[i].frame.pPitch[0], frames[i].pitch);
    }

I have three 4K 30 fps streams, and cuGraphicsEGLRegisterImage takes quite a long time. Is there a way to avoid calling cuGraphicsEGLRegisterImage for every frame and instead do it only once at startup?

Hi,

It depends on your use case.

If the frame data stays at the same memory location, you can register at initialization and unregister at termination.
Please give it a try and see whether any issues come up.

Thanks.

Hi,
Thank you for your response.

I have a custom DeepStream element that receives frames from the nvarguscamera element. I checked, and it uses an NvBuffer pool. I assume this is why I was seeing some odd behavior: frames updating more slowly, and sometimes containing corrupted data.

Hi,

Could you share the frame size and the frame-update elapsed time with us?
Does the performance improve without the custom DeepStream element?

Thanks.

Hi,

I got this working by modifying the nvarguscamera element to send IDs along with the buffers from the buffer pool, and registering each buffer only once.

My image size is 4032x3040, and I had 3 streams at 30 fps.
On average, registering one resource of this size took about 3.5 ms, so skipping the per-frame registration saves me about 10 milliseconds per frame across the three streams.

Thanks for the help.
