NvBufferGetParams failed

Hi,
Following the v4l2cuda sample, I capture YUYV frames and try to memcpy the CUDA buffer to a CPU buffer, but the process core dumps. The CUDA buffer is allocated by cudaMallocManaged; why can the buffer be accessed by fwrite but not by memcpy?

Hi,
I followed your suggestion, but what I get is a green-screen video.

  1. Successfully capture YUYV frame data into a CUDA buffer.
  2. Copy the data into a YUYV NvBuffer, which is created and filled by this code:
NvBufferCreateParams input_params = {0};
input_params.payloadType = NvBufferPayload_SurfArray;
input_params.width = camera_config_.cameras[idx].width;
input_params.height = camera_config_.cameras[idx].height;
input_params.layout = NvBufferLayout_Pitch;
input_params.colorFormat = NvBufferColorFormat_YUYV;
input_params.nvbuf_tag = NvBufferTag_NONE;
if (-1 == NvBufferCreateEx(&yuyv422_fds_[idx], &input_params)) {
    AD_LERROR(CameraTztek) << "NvBufferCreateEx failed";
}

Raw2NvBuffer(pData, 0, nWidth, nHeight, p->yuyv422_fds_[vector_index]);
  3. Convert to YUV420 for video encoding, using this pipeline:
appsrc name=appsrc ! video/x-raw(memory:NVMM),format=YUY2,width=1920,height=1080,framerate=30/1 ! nvvidconv ! video/x-raw(memory:NVMM), format=NV12, width=1920, height=1080,framerate=(fraction)30/1 ! nvv4l2h264enc control-rate=constant_bitrate bitrate=24000000 iframeinterval=0 profile=0 maxperf-enable=true all-iframe=true num-Ref-Frames=0 insert-sps-pps=true ! video/x-h264, stream-format=(string)byte-stream ! h264parse config-interval=-1 ! appsink name = appsink
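
For reference, a minimal sketch of how a launch string like this is typically instantiated and how the appsrc element is fetched (variable names are illustrative; gst_parse_launch and gst_bin_get_by_name are standard GStreamer APIs):

    GError *error = nullptr;
    GstElement *pipeline = gst_parse_launch(pipeline_str, &error);  // pipeline_str holds the string above
    if (pipeline == nullptr) {
        g_printerr("gst_parse_launch failed: %s\n", error->message);
        g_error_free(error);
    } else {
        // source_ is the element named "appsrc" in the launch string
        GstElement *source_ = gst_bin_get_by_name(GST_BIN(pipeline), "appsrc");
        gst_element_set_state(pipeline, GST_STATE_PLAYING);
    }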

and push the buffer with:

Status_t WriteFrameWithNvBuffer(const int yuyv_fd, const uint64_t &timestamp_ns) {
    GstClockTime duration, timestamp;
    duration = gst_util_uint64_scale_int(1, GST_SECOND, fps_);
    timestamp = num_frames_ * duration;
    GstBuffer *buffer;
    GstFlowReturn ret;
    GstMapInfo map = {0};
    NvBufferParams par;
    gpointer data = NULL;
    GstMemoryFlags flags = (GstMemoryFlags)0;
    if (SUCCESS != NvBufferGetParams(yuyv_fd, &par)) {
        return FAILURE;  // assuming Status_t defines a failure code alongside SUCCESS
    }
    data = g_malloc(par.nv_buffer_size);
    // Wrap the CPU buffer in a GstBuffer; g_free as the destroy notify
    // releases the allocation when the GstBuffer is unreffed.
    buffer =
        gst_buffer_new_wrapped_full(flags, data, par.nv_buffer_size, 0,
                                    par.nv_buffer_size, data, g_free);
    GST_BUFFER_DURATION(buffer) = duration;
    GST_BUFFER_PTS(buffer) = timestamp;
    GST_BUFFER_DTS(buffer) = timestamp;
    // set the current frame number
    GST_BUFFER_OFFSET(buffer) = num_frames_;
    gst_buffer_map(buffer, &map, GST_MAP_WRITE);
    // CPU copy of the NvBuffer contents into the wrapped GstBuffer
    memcpy(map.data, par.nv_buffer, par.nv_buffer_size);
    gst_buffer_unmap(buffer, &map);
    g_signal_emit_by_name(source_, "push-buffer", buffer, &ret);
    gst_buffer_unref(buffer);
    if (timestamp_writer_ != nullptr && auto_write_frame_) {
        fprintf(timestamp_writer_, "%d, %lu\n", num_frames_, timestamp_ns);
    }
    num_frames_++;
    return SUCCESS;
}

I have checked that Raw2NvBuffer returns success, but the copy does not complete. What should I do?

Hi,
In this way the data is copied through the CPU, which may still take a certain amount of CPU usage. We suggest getting the CUDA pointer of the NvBuffer, so that you can copy the data by calling cudaMemcpy(d_b, d_a, memSize, cudaMemcpyDeviceToDevice). Please take pitch vs. width into account and copy line by line, as sketched below.
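
Here is a minimal sketch of that copy, assuming src is the contiguous capture buffer from cudaMallocManaged and dst/dst_pitch come from the mapped NvBuffer (see the EGL mapping later in this thread); cudaMemcpy2D performs the line-by-line pitched copy in a single call:

    #include <cuda_runtime.h>

    // Copy a tightly packed YUYV frame into a pitch-linear destination.
    // dst/dst_pitch: NvBuffer plane pointer and stride in bytes.
    // src: contiguous capture buffer; width/height in pixels.
    cudaError_t CopyFrameToNvBuffer(void *dst, size_t dst_pitch,
                                    const void *src, int width, int height) {
        const size_t row_bytes = (size_t)width * 2;  // YUYV: 2 bytes per pixel
        return cudaMemcpy2D(dst, dst_pitch, src, row_bytes,
                            row_bytes, height, cudaMemcpyDeviceToDevice);
    }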

Hi,
Thanks for your reply first.

Do you mean I need to replace the Raw2NvBuffer call with cudaMemcpy? I use userptr mode to capture the camera data into a buffer allocated by cudaMallocManaged; can you tell me which part I need to modify?

I changed the calling order of the CUDA processing and the encoding, and successfully recorded some data. But there are some exception frames, and error prints like: NVMAP_IOC_WRITE failed: Interrupted system call

Hi,
Is there any sample showing how to copy it line by line?

Hi,
There is no existing code for this use case. You would need to refer to the jetson_multimedia_api samples and develop the use case yourself. To get the CUDA pointer of an NvBuffer, please call these functions:

NvEGLImageFromFd();
cuGraphicsEGLRegisterImage();
cuGraphicsResourceGetMappedEglFrame();
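
A rough sketch of that sequence (assuming egl_display_ is an initialized EGLDisplay, a CUDA context is current, and EGL/egl.h, cudaEGL.h and nvbuf_utils.h are included; error handling omitted):

    // Map the NvBuffer dmabuf fd to a CUDA device pointer.
    EGLImageKHR egl_image = NvEGLImageFromFd(egl_display_, yuyv_fd);

    CUgraphicsResource resource;
    cuGraphicsEGLRegisterImage(&resource, egl_image,
                               CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);

    CUeglFrame egl_frame;
    cuGraphicsResourceGetMappedEglFrame(&egl_frame, resource, 0, 0);

    // For a pitch-linear YUYV buffer, plane 0 holds the frame data.
    void *dst = egl_frame.frame.pPitch[0];
    size_t dst_pitch = egl_frame.pitch;  // stride in bytes, including padding

    // ... copy into dst here (e.g. the cudaMemcpy2D sketch above) ...

    cuGraphicsUnregisterResource(resource);
    NvDestroyEGLImage(egl_display_, egl_image);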

Hi,
I have succeeded in encoding with NvBuffer, but CPU usage is still high. Which way is better:

  1. Use NvBufferTransform to convert to NV12, and push the NvBuffer into nvv4l2h264enc.

  2. Push the YUYV422 NvBuffer into nvvidconv, and let nvvidconv do the conversion.

What does nvvidconv actually do?

Hi,
From the previous comments, the CPU usage should come from copying the YUV422 data from the CPU buffer to the NVMM buffer (NvBuffer). Since there are data-alignment constraints on NVMM buffers, the frame data cannot be put into an NVMM buffer directly and has to be copied line by line.

The nvvidconv plugin is implemented through the NvBuffer APIs; it works the same way as demonstrated in the jetson_multimedia_api samples. From JetPack 4.5 on, the plugins are open source, so please download the source code package and check the code. You can find the link on the release page:
https://developer.nvidia.com/embedded/linux-tegra-r3272

L4T Driver Package (BSP) Sources

Using cudaMemcpy to copy the imx390 data works well for encoding, but not for the imx490. Actually, I don't quite understand this sentence: "Please consider pitch/width and copy it line by line."

Hi,
The captured frame data is contiguous at 2880x1860 (2880x1860x2 bytes), while the allocated NvBuffer has a line pitch of 2944 pixels (with width=2880). So you would need to copy the first line of 2880 pixels (2880x2 bytes), jump ahead 2944-2880 pixels in the NvBuffer to the second line, copy the second line of 2880 pixels (2880x2 bytes), jump ahead again to the third line, and continue like this through the 1860th line.
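
In code, that procedure is a pitched, line-by-line copy; a sketch with these exact numbers (names are illustrative, and src/dst are CUDA device pointers as discussed above):

    #include <cuda_runtime.h>

    // Copy a 2880x1860 YUYV frame into a pitch-linear NvBuffer plane,
    // line by line, exactly as described above.
    void CopyYuyvLineByLine(unsigned char *dst,          // NvBuffer plane, pitch 2944 px
                            const unsigned char *src) {  // contiguous capture buffer
        const size_t row_bytes = 2880 * 2;  // valid bytes per line (2880 px * 2 B)
        const size_t dst_pitch = 2944 * 2;  // NvBuffer line stride in bytes
        for (int y = 0; y < 1860; ++y) {
            cudaMemcpy(dst + y * dst_pitch, src + y * row_bytes,
                       row_bytes, cudaMemcpyDeviceToDevice);
        }
        // A single cudaMemcpy2D(dst, dst_pitch, src, row_bytes, row_bytes,
        // 1860, cudaMemcpyDeviceToDevice) does the same in one call.
    }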
