NvBufferGetParams failed

Hello, I use Xavier on JetPack 4.4, and I have written code that captures camera data, runs CUDA processing on it, and encodes it.
My GStreamer pipeline looks like this:

appsrc name=appsrc ! video/x-raw(memory:NVMM),format=YUY2,width=1920,height=1080,framerate=30/1 ! nvvidconv ! video/x-raw(memory:NVMM), format=NV12, width=1920, height=1080,framerate=(fraction)30/1 ! nvv4l2h264enc control-rate=constant_bitrate bitrate=24000000 iframeinterval=0 profile=0 maxperf-enable=true all-iframe=true num-Ref-Frames=0 insert-sps-pps=false ! video/x-h264, stream-format=(string)byte-stream ! h264parse ! qtmux ! filesink location=/tmp/today/sensors_record/camera//center_camera_fov30.h264

I can confirm that the dmabuf_fd works, because I run the YUYV-to-BGR process and write out the BGR data to check it. Then I would like to encode the data, but NvBufferGetParams fails, which confuses me.

Do I need to use NvBufferSession for multithreaded processing? Is there any sample?

Another question: can NvEGLImageFromFd not be called from multiple threads? If I run 2 cameras or more, NvEGLImageFromFd also fails, but with one camera it succeeds. What should I do?

Hi,
It is not clear what the issue is. Please try this sample:
Opencv gpu mat into GStreamer without downloading to cpu - #15 by DaneLLL

And check if you can share a patch on the sample so that we can replicate the issue and investigate.

Hi,
I use a callback function to receive images from the camera, and each camera's capture process runs independently on its own thread.
This is my callback function that receives the dmabuf_fd:

void CameraCallBack(int nChan,struct timespec stTime,int nWidth,int nHeight,int nDatalen,int dmabuf_fd) {
            std::unique_lock<std::mutex> lock(data_mutex_);
            if (dmabuf_fd <= 0) {
                AD_LERROR << "frame error : chan[" << nChan<< "]" << cpld_timestamps_ns;
                return;
            }
            EGLImageKHR egl_image;
            egl_image = NvEGLImageFromFd(egl_display_, dmabuf_fd);
            if (egl_image == NULL) {
                AD_LERROR<< "NvEGLImageFromFd failed chan[" << nChan << "]"
                   << " dmabuf_fd "<< dmabuf_fd;
                return;
            }
            CUresult status;
            CUeglFrame eglFrame;
            CUgraphicsResource pResource = NULL;
            cudaFree(0);
            status = cuGraphicsEGLRegisterImage(&pResource, egl_image, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
               if (status != CUDA_SUCCESS) {
                AD_LERROR<< "EGLRegisterImage failed chan[" << nChan << "]" << status
                    << " dmabuf_fd "<< dmabuf_fd;
                return;
            }
            status =cuGraphicsResourceGetMappedEglFrame(&eglFrame, pResource, 0, 0);
            if (status != CUDA_SUCCESS) {
                AD_LERROR<< "GetMappedEglFrame failed chan[" << nChan << "]"
                    << status<< " dmabuf_fd " << dmabuf_fd;
                return;
            }
            status = cuCtxSynchronize();

            cudaYUYVToBGR(
                (unsigned char *)eglFrame.frame.pPitch[0] + yuv422_data_offsets_[nChan],
                gpu_image_data_ + bgr_data_offsets_[nChan],
                nWidth, nHeight,
                cv::gpu::StreamAccessor::getStream(cuda_streams_[vector_index]));

            status = cuCtxSynchronize();
            status = cuGraphicsUnregisterResource(pResource);
            if (status != CUDA_SUCCESS) {
                AD_LERROR
                    << "UnregisterResource failed chan[" << nChan << "]"
                    << status << " dmabuf_fd " << dmabuf_fd;
                return;
            }
            // NvDestroyEGLImage() returns 0 on success; check its own return
            // value instead of the stale CUDA status from the previous call.
            if (NvDestroyEGLImage(egl_display_, egl_image) != 0) {
                AD_LERROR
                    << "NvDestroyEGLImage failed chan[" << nChan << "]"
                    << " dmabuf_fd " << dmabuf_fd;
                return;
            }

          if (encoder_[nChan] != nullptr) {
              encoder_[nChan]->WriteFrameWithFD(dmabuf_fd, camera_timestamps_ns);
          }
}

and this is the WriteFrameWithFD function:

bool WriteFrameWithFD(int dmabuf_fd) {
    GstClockTime duration, timestamp;
    duration = gst_util_uint64_scale_int(1, GST_SECOND, fps_);
    timestamp = num_frames_ * duration;
    GstBuffer *buffer;
    GstFlowReturn ret;
    GstMapInfo map = {0};
    NvBufferParams par;
    gpointer data = NULL, user_data = NULL;
    user_data = g_malloc(sizeof(int));
    GST_INFO("NvBufferCreate %d", dmabuf_fd);
    GstMemoryFlags flags = (GstMemoryFlags)0;
    *reinterpret_cast<int *>(user_data) = dmabuf_fd;

    // NvBufferGetParams() returns 0 on success, so branch on == 0
    if (NvBufferGetParams(dmabuf_fd, &par) == 0) {
        AD_LERROR(GSTREAMER_ENCODER)
            << "user_data: " << *reinterpret_cast<int *>(user_data)
            << std::endl;
        AD_LERROR(GSTREAMER_ENCODER)
            << "NvBufferParams: dma_df " << par.dmabuf_fd
            << " payloadType :" << par.payloadType
            << " memsize :" << par.memsize
            << " nv_buffer_size : " << par.nv_buffer_size << " pixel_format  "
            << par.pixel_format << std::endl;
    } else {
        AD_LERROR(GSTREAMER_ENCODER)
            << "NvBufferGetParams error: " << std::endl;
        return false;
    }
    data = g_malloc(par.nv_buffer_size);

    buffer = gst_buffer_new_wrapped_full(flags, data, par.nv_buffer_size, 0,
                                         par.nv_buffer_size, user_data,
                                         notify_to_destroy);
    GST_BUFFER_DURATION(buffer) = duration;
    GST_BUFFER_PTS(buffer) = timestamp;
    GST_BUFFER_DTS(buffer) = timestamp;
    // set the current number in the frame
    GST_BUFFER_OFFSET(buffer) = num_frames_;

    gst_buffer_map(buffer, &map, GST_MAP_WRITE);
    memcpy(map.data, par.nv_buffer, par.nv_buffer_size);
    gst_buffer_unmap(buffer, &map);

    g_signal_emit_by_name(source_, "push-buffer", buffer, &ret);
    gst_buffer_unref(buffer);

    num_frames_++;
    return true;
}

So there are two problems:

  1. Running only cudaYUYVToBGR without encoding: it succeeds with one camera, but NvBufferGetParams fails with 2 cameras or more.
  2. Running cudaYUYVToBGR plus encoding with one camera: NvBufferGetParams fails inside WriteFrameWithFD.

Hi,
BGR is not supported in NvBuffer. Please create the NvBuffer in RGBA and call NvBufferTransform() to convert YUV422 to RGBA, for example:
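
A rough sketch of that flow (not from your code; src_yuv422_fd stands for the camera dmabuf from your callback, and 1920x1080 matches your pipeline):

#include <string.h>
#include "nvbuf_utils.h"

int CreateRgbaAndTransform(int src_yuv422_fd, int width, int height, int *rgba_fd) {
    /* Allocate a pitch-linear RGBA NvBuffer once and reuse it per frame */
    NvBufferCreateParams cparams = {0};
    cparams.width = width;
    cparams.height = height;
    cparams.payloadType = NvBufferPayload_SurfArray;
    cparams.layout = NvBufferLayout_Pitch;
    cparams.colorFormat = NvBufferColorFormat_ABGR32;
    cparams.nvbuf_tag = NvBufferTag_NONE;
    if (NvBufferCreateEx(rgba_fd, &cparams) != 0) {
        return -1;
    }

    /* VIC does the YUV422 -> RGBA conversion; no CPU copy involved */
    NvBufferTransformParams tparams;
    memset(&tparams, 0, sizeof(tparams));
    tparams.transform_flag = NVBUFFER_TRANSFORM_FILTER;
    tparams.transform_filter = NvBufferTransform_Filter_Smart;
    return NvBufferTransform(src_yuv422_fd, *rgba_fd, &tparams);
}

The resulting RGBA fd can then go through NvEGLImageFromFd()/cuGraphicsEGLRegisterImage() the same way as in your callback.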

Hi,
The camera buffer format is YUV422; I only convert to BGR into gpu_image_data_ for checking, and the encode function uses the YUV422 dmabuffer.
Either way, NvBufferGetParams should not fail with:

appsrc name=appsrc ! video/x-raw(memory:NVMM),format=YUY2,width=1920,height=1080,framerate=30/1 ! nvvidconv ! video/x-raw(memory:NVMM), format=NV12, width=1920, height=1080,framerate=(fraction)30/1 ! nvv4l2h264enc control-rate=constant_bitrate bitrate=24000000 iframeinterval=0 profile=0 maxperf-enable=true all-iframe=true num-Ref-Frames=0 insert-sps-pps=false ! video/x-h264, stream-format=(string)byte-stream ! h264parse ! qtmux ! filesink location=/tmp/today/sensors_record/camera//center_camera_fov30.h264

Hi,
You would need to create the NvBuffer by calling NvBufferCreate(), and then call NvBufferGetParams() to get information about the NvBuffer. For example:
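
A minimal sketch (NvBufferColorFormat_YUYV and 1920x1080 are only assumptions to match your camera format):

#include <stdio.h>
#include "nvbuf_utils.h"

bool CheckNvBufferParams() {
    int fd = -1;
    if (NvBufferCreate(&fd, 1920, 1080, NvBufferLayout_Pitch,
                       NvBufferColorFormat_YUYV) != 0) {
        return false;
    }

    NvBufferParams par;
    int ret = NvBufferGetParams(fd, &par);  /* returns 0 on success */
    if (ret == 0) {
        printf("pitch=%u width=%u nv_buffer_size=%u\n",
               par.pitch[0], par.width[0], par.nv_buffer_size);
    }

    NvBufferDestroy(fd);
    return ret == 0;
}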

Hi,
This confuses me, because none of the samples create an NvBuffer right before calling NvBufferGetParams, and I already create the NvBuffers at initialization:

bool initDma() {
    m_pNvbuff = new nv_buffer[V4L2_BUFFER_LENGHT];
    NvBufferCreateParams input_params = {0};
    input_params.payloadType = NvBufferPayload_SurfArray;
    input_params.width = m_dwWidth;
    input_params.height = m_dwHeight;
    input_params.layout = NvBufferLayout_Pitch;

    /* Create buffer and provide it with camera */
    for (unsigned int index = 0; index < V4L2_BUFFER_LENGHT; index++) {
        int fd;
        // NvBufferParams params = {0};
        input_params.colorFormat = get_nvbuff_color_fmt(V4L2_VIDEO_FORMAT);
        input_params.nvbuf_tag = NvBufferTag_CAMERA;
        if (-1 == NvBufferCreateEx(&fd, &input_params)) {
            LIBCAMERA_ERROR(
                "NvBufferCreateEx failed,devname=%s,errno: %d, %s\n",
                m_strDevName.c_str(), errno, strerror(errno));
            return false;
        }
        m_pNvbuff[index].dmabuff_fd = fd;
        if (-1 == NvBufferMemMap(m_pNvbuff[index].dmabuff_fd, 0,
                                 NvBufferMem_Read_Write,
                                 (void **)&m_pNvbuff[index].start)) {
            LIBCAMERA_ERROR("NvBufferMemMap failed,devname=%s,errno: %d, %s\n",
                            m_strDevName.c_str(), errno, strerror(errno));
            return false;
        }
    }

    bool bRet = true;
    struct v4l2_requestbuffers req;
    memset(&req, 0, sizeof(struct v4l2_requestbuffers));
    req.count =V4L2_BUFFER_LENGHT;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_DMABUF; 

    if (-1 == xioctl(m_videoFd, VIDIOC_REQBUFS, &req)) {
        LIBCAMERA_ERROR(
            "xioctl  VIDIOC_REQBUFS failed,devname=%s,errno: %d, %s\n",
            m_strDevName.c_str(), errno, strerror(errno));
        return false;
    }

    for (unsigned int index = 0; index < V4L2_BUFFER_LENGHT; index++) {
        struct v4l2_buffer buf;

        /* Query camera v4l2 buf length */
        memset(&buf, 0, sizeof buf);
        buf.index = index;
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_DMABUF;

        if (ioctl(m_videoFd, VIDIOC_QUERYBUF, &buf) < 0) {
            bRet = false;
            LIBCAMERA_ERROR(
                "xioctl  VIDIOC_QUERYBUF failed,devname=%s,errno: %d, %s\n",
                m_strDevName.c_str(), errno, strerror(errno));
            break;
        }

        buf.m.fd = (unsigned long)m_pNvbuff[index].dmabuff_fd;
        if (buf.length != m_pNvbuff[index].size) {
            // LIBCAMERA_ERROR("Camera v4l2 buf length is not
            // expected,devname=%s,errno: %d, %s\n", m_strDevName.c_str(),
            // errno, strerror(errno));
            m_pNvbuff[index].size = buf.length;
        }
    }

    if (bRet == false) {
        if (m_pNvbuff != NULL) {
            for (unsigned i = 0; i < V4L2_BUFFER_LENGHT; i++) {
                if (m_pNvbuff[i].dmabuff_fd) {
                    NvBufferDestroy(m_pNvbuff[i].dmabuff_fd);
                }
            }
            delete[] m_pNvbuff;
            m_pNvbuff = NULL;
        }
    }
    return bRet;
}

and capture frames from the camera with:

        case IO_METHOD_DMA:
            buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            buf.memory = V4L2_MEMORY_DMABUF;
            if (-1 == xioctl(m_videoFd, VIDIOC_DQBUF, &buf)) {
                bRet = false;
                LIBCAMERA_ERROR(
                    "xioctl VIDIOC_DQBUF failed,dev=%s,error:%d:%s\n",
                    m_strDevName.c_str(), errno, strerror(errno));
                return -1;
            }
            NvBufferMemSyncForDevice(m_pNvbuff[buf.index].dmabuff_fd, 0,
                                     (void **)&m_pNvbuff[buf.index].start);
            srcData = m_pNvbuff[buf.index].start;
            srcDatalen = m_pNvbuff[buf.index].size;
            break;

Hi,
Looks like you are capturing YUV422 through v4l2. For this use-case, please refer to

/usr/src/jetson_multimedia_api/samples/12_camera_v4l2_cuda

It demonstrates capturing YUV422 into an NvBuffer and converting it to another YUV420 NvBuffer. You can change the YUV420 NvBuffer to an RGBA NvBuffer and give it a try.

Hi,
Thanks for your reply, I will give it a try. Is BGRx also OK?
Since nvvidconv supports BGRx(NVMM), that would be better for me.

Hi,
You can use BGRx if you don't need to call NvEGLImageFromFd() and cuGraphicsEGLRegisterImage(); that format is not supported in CUDA. If you need to access and process the buffer through CUDA, please create an RGBA NvBuffer.

Hi,
I have run the 12_camera_v4l2_cuda sample, but the renderer shows an incorrect image.
I can confirm that the v4l2 buffer is good before NvBufferTransform, because I checked it by writing it to disk.
And I didn't modify the sample.
The render output is shown in the attached screenshot.

Hi,
You may try the v4l2-ctl command to check if the frame data is correct. If it works with v4l2-ctl, it should work the same in jetson_multimedia_api and gstreamer.

I used the v4l2-ctl command to check the image and it works. As I said, the v4l2 buffer is good before NvBufferTransform.
And GStreamer encodes fine with memory that is memcpy'd from the v4l2 DMA NvBuffer.

v4l2-ctl --set-fmt-video=width=2880,height=1860,pixelformat=YUYV --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1 --stream-to=imx490.raw -d /dev/video0

Hi,
The resolution is unusual and data alignment needs to be considered:
YUV camera(5.4M) preview issue - #19 by DaneLLL

Please add one line to manually set ctx->capture_dmabuf=0 and give it a try, for example:
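
Something like this in the sample's main(), after the command-line options are parsed (the exact placement is up to you, and the flag name comes from the sample version referenced above):

    /* force the MMAP + Raw2NvBuffer capture path instead of capturing
       directly into the dmabuf */
    ctx.capture_dmabuf = false;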

Hi,
It works after setting capture_dmabuf=0, but I am concerned about the memory copy: I don't want to capture frame data into a CPU buffer first and then copy it to the NvBuffer.
What should I do next? And does only the data alignment on the width need to be considered? Is that the reason?

Hi,
NvBuffer is a hardware DMA buffer, so the alignment is fixed. Is it possible to change your sensor driver to fit the pitch, width, and height of the NvBuffer? Or capture the frame at a standard resolution like 1920x1080, so that pitch=width.

Hi,
It is hard to change the sensor driver, because it was not developed by me. I also don't know how to capture the frame at a standard resolution like 1920x1080 so that pitch=width. Can you give me some support?

If the second way is hard to do, I would like to push CPU data into GStreamer instead of NVMM. Which of the two options below is better:

use userptr with the zero-copy feature, do the CUDA processing on the captured camera data, and push the captured camera data into GStreamer

or

use the DMA buffer, run NvBufferMemSyncForDevice, do the CUDA processing on the data obtained from cuGraphicsResourceGetMappedEglFrame via the dmabuf_fd, and push the captured camera data into the GStreamer pipeline

It seems the second way does not work with multiple threads (cameras): NvBufferGetParams failed - #4 by zyukyunman

And how can I further resolve the encoding performance issue with gstreamer?

Hi,
If width is not equal to pitch, the memory copy is unavoidable: the frame data has to be copied into the NvBuffer line by line, as demonstrated in Raw2NvBuffer(). It is the same when using the gstreamer nvvidconv plugin, which also calls Raw2NvBuffer() internally.
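
Conceptually the line-by-line copy looks like this (a sketch of the idea, not the actual Raw2NvBuffer() source; it assumes a packed YUYV frame already in CPU memory):

#include <string.h>
#include "nvbuf_utils.h"

bool CopyYuyvToNvBuffer(const unsigned char *src, int width, int height,
                        int dmabuf_fd) {
    NvBufferParams par;
    if (NvBufferGetParams(dmabuf_fd, &par) != 0) {
        return false;
    }

    void *virt = NULL;
    if (NvBufferMemMap(dmabuf_fd, 0, NvBufferMem_Write, &virt) != 0) {
        return false;
    }

    unsigned int src_pitch = width * 2;      /* packed YUYV: 2 bytes/pixel */
    unsigned int dst_pitch = par.pitch[0];   /* hardware-aligned pitch     */
    for (int row = 0; row < height; ++row) {
        memcpy((unsigned char *)virt + row * dst_pitch,
               src + row * src_pitch, src_pitch);
    }

    /* flush CPU writes so VIC/NVENC see the new frame */
    NvBufferMemSyncForDevice(dmabuf_fd, 0, &virt);
    NvBufferMemUnMap(dmabuf_fd, 0, &virt);
    return true;
}

When pitch equals width the loop degenerates into one large memcpy, which is why a standard resolution avoids the per-line overhead.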

You can run jetson_clocks to run the CPU cores at maximum clock and check if there is a performance improvement.

Hi,
Thanks for your reply. So it seems that it is best to:

  1. use userptr with the zero-copy feature and do the CUDA processing on the YUYV data,

  2. and then allocate NvBuffers in NvBufferColorFormat_YUYV, call NvBufferTransform() to convert to YUV420 (NvBufferColorFormat_YUV420), and push into appsrc -> nvv4l2h264enc without nvvidconv.

If my description is correct, I will give it a try. Where can I find the source code for Raw2NvBuffer()?

Hi,
It looks like a possible solution. You can implement it like this:

  1. Capture the YUYV frame data into a CUDA buffer (see the sketch at the end of this post).
  2. Copy the data into a YUYV NvBuffer line by line (handling the case pitch != width).
  3. Convert to YUV420 for video encoding.

The source of Raw2NvBuffer() is not public. For capturing frame data into a CUDA buffer, please refer to

/usr/src/jetson_multimedia_api/samples/v4l2cuda
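
For reference, the core idea in that sample is to hand CUDA-visible host memory to V4L2 as USERPTR buffers, so a captured YUYV frame can be read directly by a CUDA kernel. A rough sketch (it assumes your capture driver accepts V4L2_MEMORY_USERPTR; buffer count and error handling are simplified, and the function and variable names are only illustrative):

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>
#include <cuda_runtime.h>

bool SetupUserptrCapture(int video_fd, size_t frame_size,
                         void *host_ptrs[4], void *dev_ptrs[4]) {
    struct v4l2_requestbuffers req;
    memset(&req, 0, sizeof(req));
    req.count = 4;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_USERPTR;
    if (ioctl(video_fd, VIDIOC_REQBUFS, &req) < 0) {
        return false;
    }

    for (unsigned int i = 0; i < req.count; ++i) {
        /* pinned + mapped: the same allocation is addressable by the GPU
           (cudaSetDeviceFlags(cudaDeviceMapHost) may be needed before the
           first CUDA call on some setups) */
        if (cudaHostAlloc(&host_ptrs[i], frame_size, cudaHostAllocMapped) != cudaSuccess) {
            return false;
        }
        cudaHostGetDevicePointer(&dev_ptrs[i], host_ptrs[i], 0);

        struct v4l2_buffer buf;
        memset(&buf, 0, sizeof(buf));
        buf.index = i;
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_USERPTR;
        buf.m.userptr = (unsigned long)host_ptrs[i];
        buf.length = frame_size;
        if (ioctl(video_fd, VIDIOC_QBUF, &buf) < 0) {
            return false;
        }
    }
    return true;
}

After VIDIOC_DQBUF you can pass dev_ptrs[buf.index] straight to your cudaYUYVToBGR kernel, then copy the same host buffer into the YUYV NvBuffer line by line as in step 2 and call NvBufferTransform() to YUV420 for the encoder.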