Send an OpenCV GpuMat to a GStreamer pipeline without a memory copy?

My app has a GStreamer pipeline with appsrc and the OMX encoder, and a GpuMat produced by some OpenCV processing.
After reading some topics I found a solution using NvBuffer and mmap, but mmap works with CPU memory. So I tried the following steps:

  1. Create NvBuffer
  2. Get fd and call mmap(…)
  3. Create Mat with pointer from mmap(…)
  4. GpuMat.download(Mat)
  5. gst_buffer_new_wrapped_full and some magic with inmem->allocator->mem_type = "nvcam"

This works, but it still requires copying memory from GPU to CPU.
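For reference, steps 1-5 above look roughly like this. This is only a sketch, assuming the nvbuf_utils flavour of the MM APIs (NvBufferCreate/NvBufferGetParams); exact enum names and struct fields vary between L4T releases, and most error handling and cleanup is omitted:

```cpp
#include <sys/mman.h>

#include <nvbuf_utils.h>          // NvBufferCreate, NvBufferGetParams (L4T MM APIs)
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <gst/gst.h>

// Sketch of the mmap-based path described above (steps 1-5).
// The GPU -> CPU copy happens in GpuMat::download().
GstBuffer *gpuMatToGstBufferWithCopy(const cv::cuda::GpuMat &gpu,
                                     int width, int height)
{
    // 1. Create an NvBuffer (pitch-linear RGBA here, as an example).
    int fd = -1;
    if (NvBufferCreate(&fd, width, height, NvBufferLayout_Pitch,
                       NvBufferColorFormat_ARGB32) != 0)
        return NULL;

    NvBufferParams params;
    NvBufferGetParams(fd, &params);

    // 2. Map the dmabuf fd into CPU address space.
    void *cpu = mmap(NULL, params.psize[0], PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, params.offset[0]);
    if (cpu == MAP_FAILED)
        return NULL;

    // 3.-4. Wrap the mapping in a cv::Mat header (no allocation) and
    // download the GpuMat into it -- this is the GPU -> CPU copy.
    cv::Mat wrapped(height, width, CV_8UC4, cpu, params.pitch[0]);
    gpu.download(wrapped);
    munmap(cpu, params.psize[0]);

    // 5. Wrap the NvBuffer handle in a GstBuffer without another copy
    // and mark the memory type so downstream elements treat it as NVMM.
    GstBuffer *buffer = gst_buffer_new_wrapped_full(GstMemoryFlags(0),
                                                    params.nv_buffer,
                                                    params.nv_buffer_size, 0,
                                                    params.nv_buffer_size,
                                                    NULL, NULL);
    GstMemory *inmem = gst_buffer_peek_memory(buffer, 0);
    inmem->allocator->mem_type = "nvcam";
    return buffer;
}
```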

I also played with NvEGLImageFromFd and mapEGLImage2Float, but cuGraphicsEGLRegisterImage crashed with error code 999. In general I’m not sure this can solve my problem, because the documentation is very sparse.

What is the best way to send a GpuMat to GStreamer?
Thanks.

This may not be straightforward for your case, but you may have a look at what is done in the nvivafilter plugin.
Some info may be found here: https://devtalk.nvidia.com/default/topic/1022543/jetson-tx2/gstreamer-nvmm-lt-gt-opencv-gpumat/post/5208232/#5208232
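As an illustration, nvivafilter sits in a pure-GStreamer pipeline and hands each NVMM frame to a user-supplied CUDA library. The exact pipeline below is a sketch (the source element and caps depend on your setup); libnvsample_cudaprocess.so is the sample library shipped with L4T:

```shell
gst-launch-1.0 nvcamerasrc ! 'video/x-raw(memory:NVMM), format=NV12' ! \
    nvivafilter cuda-process=true customer-lib-name=libnvsample_cudaprocess.so ! \
    'video/x-raw(memory:NVMM), format=RGBA' ! nvoverlaysink
```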

Hi smarttowel0,
For gstreamer, please try @Honey_Patouceul’s suggestion.

You may also try NvVideoEncoder in the MM APIs.

I looked at the code from @Honey_Patouceul’s link and tried to write a converter from GpuMat to GstBuffer:

GstBuffer *DmaBuffer::toGstBuffer(const cv::cuda::GpuMat &mat)
{
    EGLImageKHR image = NvEGLImageFromFd(m_eglDisplay, m_fd);
    CUresult status;
    CUeglFrame eglFrame;
    CUgraphicsResource pResource = NULL;
    
    cudaFree(0); // make sure a CUDA context exists on the calling thread
    
    status = cuGraphicsEGLRegisterImage(&pResource, image, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
    if (status != CUDA_SUCCESS) {
        printf("cuGraphicsEGLRegisterImage failed: %d\n", status);
        NvDestroyEGLImage(m_eglDisplay, image);
        return NULL;
    }
    
    status = cuGraphicsResourceGetMappedEglFrame(&eglFrame, pResource, 0, 0);
    if (status != CUDA_SUCCESS) {
        printf("cuGraphicsResourceGetMappedEglFrame failed\n");
    }
    
    status = cuCtxSynchronize();
    if (status != CUDA_SUCCESS) {
        printf("cuCtxSynchronize failed\n");
    }
    
    if (eglFrame.frameType == CU_EGL_FRAME_TYPE_PITCH) {
        if (eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA) {
            // Wrap the mapped EGL frame in a GpuMat header (no allocation)
            // and do a device-to-device copy from the source GpuMat.
            cv::cuda::GpuMat mapped(cv::Size(eglFrame.width, eglFrame.height), CV_8UC4,
                                    eglFrame.frame.pPitch[0]);
            mat.copyTo(mapped);
        } else {
            printf("Invalid EGL color format for OpenCV\n");
        }
    } else {
        printf("Invalid frame type for OpenCV\n");
    }
    
    status = cuCtxSynchronize();
    if (status != CUDA_SUCCESS) {
        printf("cuCtxSynchronize failed after copy\n");
    }
    
    status = cuGraphicsUnregisterResource(pResource);
    if (status != CUDA_SUCCESS) {
        printf("cuGraphicsUnregisterResource failed: %d\n", status);
    }
    
    // Wrap the underlying NvBuffer in a GstBuffer without copying.
    GstBuffer *buffer = gst_buffer_new_wrapped_full(GstMemoryFlags(0),
                                                    m_params.nv_buffer,
                                                    m_params.nv_buffer_size, 0,
                                                    m_params.nv_buffer_size,
                                                    NULL, NULL);
    
    GstMemory *inmem = gst_buffer_peek_memory(buffer, 0);
    inmem->allocator->mem_type = "nvcam";
    
    NvDestroyEGLImage(m_eglDisplay, image);
    
    return buffer;
}

I call this method from the appsrc “need-data” signal, and sometimes it gets stuck at the cuGraphicsEGLRegisterImage call. It looks like a thread synchronization issue, but I can’t understand what I’m doing wrong. The source code of nvivafilter could help me, but it’s a closed-source plugin :(
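For completeness, the need-data side is roughly this (a sketch; MyPipeline is a hypothetical owner struct holding the DmaBuffer converter from the code above and the latest processed frame, and timestamping is omitted):

```cpp
#include <gst/gst.h>
#include <gst/app/gstappsrc.h>
#include <opencv2/core/cuda.hpp>

// Hypothetical state object; DmaBuffer is the converter class above.
struct MyPipeline {
    DmaBuffer *dmaBuffer;
    cv::cuda::GpuMat latestFrame;
};

// "need-data" handler: convert the current GpuMat and push it downstream.
static void onNeedData(GstElement *appsrc, guint /*length*/, gpointer user_data)
{
    auto *self = static_cast<MyPipeline *>(user_data);
    GstBuffer *buffer = self->dmaBuffer->toGstBuffer(self->latestFrame);
    if (buffer)
        gst_app_src_push_buffer(GST_APP_SRC(appsrc), buffer); // takes ownership
}

// Wiring it up:
// g_signal_connect(appsrc, "need-data", G_CALLBACK(onNeedData), &state);
```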

Hi smarttowel0,
Please share code we can use to reproduce the issue. Are you on r28.1?

Yes, I have a Jetson TX2 with L4T r28.1.
Sorry, I can’t provide full code to reproduce this issue, but I use the code from my previous post on each “need-data” signal.
In general, my program has 3 separate threads:

  1. Capture an H.264 RTSP stream with nvxio
  2. Do some processing with GpuMat
  3. Stream the GpuMat over the network with a GStreamer RTSP pipeline (appsrc -> nvvidconv -> omxh264enc -> rtph264pay)

The problem manifests in two ways:

  1. The video stream works fine for the first frames, then it starts dropping frames, and finally I get stuck at the cuGraphicsEGLRegisterImage call.
  2. Alternatively, the program can crash with the message:
pthread_mutex_lock.c:349: __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)' failed.

Backtrace:

#0  0x0000007fb4933528 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x0000007fb49349e0 in __GI_abort () at abort.c:89
#2  0x0000007fb492cc04 in __assert_fail_base (fmt=0x7fb4a19240 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7fb7e6b9f8 "INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)", file=file@entry=0x7fb7e6bd20 "pthread_mutex_lock.c", line=line@entry=349, 
    function=function@entry=0x7fb7e6bb40 <__PRETTY_FUNCTION__.9092> "__pthread_mutex_lock_full") at assert.c:92
#3  0x0000007fb492ccac in __GI___assert_fail (
    assertion=assertion@entry=0x7fb7e6b9f8 "INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)", file=file@entry=0x7fb7e6bd20 "pthread_mutex_lock.c", line=line@entry=349, 
    function=function@entry=0x7fb7e6bb40 <__PRETTY_FUNCTION__.9092> "__pthread_mutex_lock_full") at assert.c:101
#4  0x0000007fb7e616e8 in __pthread_mutex_lock_full (mutex=0xae0dd0) at pthread_mutex_lock.c:347
#5  0x0000007fb7e617fc in __GI___pthread_mutex_lock (mutex=<optimized out>) at pthread_mutex_lock.c:73
#6  0x0000007fb5e3bdbc in ?? () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#7  0x0000007fb5e3bdec in ?? () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#8  0x0000007fb5d5a108 in ?? () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#9  0x0000007fb5e8ae38 in cuGraphicsUnregisterResource () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#10 0x0000007fb5e1e838 in ?? () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#11 0x0000007fb5e1dfa0 in ?? () from /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1
#12 0x0000007f9b2d1c1c in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#13 0x0000007f9b2d08c4 in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#14 0x0000007f9b2d0d0c in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#15 0x0000007f9b2d0fcc in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#16 0x0000007f9b2d346c in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#17 0x0000007f9b260034 in ?? () from /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0
#18 0x0000007f8cb1ae20 in ?? () from /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvideosink.so
#19 0x0000007fa9cf8c6c in ?? () from /usr/lib/aarch64-linux-gnu/libgstbase-1.0.so.0
#20 0x0000007f80158f48 in ?? ()

It looks like an EGL-related bug, maybe something about synchronization of EGL calls?

Hi smarttowel0,
From the current information we cannot reproduce the issue.

If it gets stuck at cuGraphicsEGLRegisterImage(), maybe cuGraphicsUnregisterResource() is not called?

I uploaded a minimal working example to my Google Drive: https://drive.google.com/file/d/1QHJDhYmg1LNTNkNHC76jVQD9334CMtRW/view?usp=sharing.

I can reproduce these issues with this example. Sometimes it crashes, sometimes it gets stuck, and sometimes it works without problems.

Hi smarttowel0,
Your case runs VisionWorks + OpenCV + gstreamer. Can it be done with VisionWorks + gstreamer, or OpenCV + gstreamer, alone? There may be contention when running VisionWorks and OpenCV together.

Hi smarttowel0,
NvBuffer from the MM APIs and gstreamer’s video/x-raw(memory:NVMM) are different things. We do not support NvBuffer in gstreamer on r28.1.

In some cases using OpenCV is more convenient than VisionWorks, but I use VisionWorks for highly optimized solutions.

I did not fully understand: can I send a GpuMat to a gstreamer pipeline via appsrc with minimal overhead? If yes, how? Could you provide a code sample?

Thanks

Hi smarttowel0,
You cannot send a GpuMat to a gstreamer pipeline via appsrc.

For a full gstreamer pipeline, you can access video/x-raw(memory:NVMM) in nvivafilter as shown in
https://devtalk.nvidia.com/default/topic/1022543/jetson-tx2/gstreamer-nvmm-lt-gt-opencv-gpumat/post/5204389/#5204389

You can also refer to
https://devtalk.nvidia.com/default/topic/1028387/jetson-tx1/gst-encoding-pipeline-with-frame-processing-using-cuda-and-libargus/post/5232036/#5232036
It demonstrates appsrc (Argus + NvVideoEncoder) -> h264parse -> qtmux -> filesink. In your case you could implement an appsrc that uses NvBuffer + NvVideoEncoder.