IImageNativeBuffer adds 160 bytes to each line of image

zsherin · February 24, 2022, 4:15pm

I am writing a LibArgus video streaming application. We have a camera attached to a Jetson Xavier AGX Development Kit. At a high level, what the application does is get images from LibArgus, copy them to pre-allocated CUDA memory, and then the Rivermax API (also an Nvidia product but not relevant to this issue) sends that data out over the network.

I capture a single frame like so, where I use the iFrameConsumer interface to get a frame, treat it as an ImageNativeBuffer, and then either copy it to a pre-existing NvBuffer or create one. Specifically, I create them at the desired width and height of the images I am capturing off of our sensors.

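UniqueObj<Frame> frame(iFrameConsumer->acquireFrame());
// Get the IFrame interface from the acquired frame before asking for its image.
IFrame *iFrame = interface_cast<IFrame>(frame);
NV::IImageNativeBuffer *iNativeBuffer =
            interface_cast<NV::IImageNativeBuffer>(iFrame->getImage());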
if (!cur_frame.m_buffer_created)
{
    std::cout << "NVBUF ATTEMPT" << std::endl;
    std::cout << "FINAL RESOLUTION: " << iEglOutputStream->getResolution().width() << " X " << iEglOutputStream->getResolution().height() << std::endl;
    cur_frame.m_dmabuf = iNativeBuffer->createNvBuffer(iEglOutputStream->getResolution(),
                                                NvBufferColorFormat_ARGB32,
                                                NvBufferLayout_Pitch);
    
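    // Check the handle before querying its parameters, and only mark the
    // buffer as created when creation succeeded.
    if (!cur_frame.m_dmabuf)
    {
        CONSUMER_PRINT("\tFailed to create NvBuffer\n");
    }
    else
    {
        NvBufferParams par;
        NvBufferGetParams (cur_frame.m_dmabuf, &par);
        cur_frame.m_params = par;
        cur_frame.m_buffer_created = true;
    }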
}
else if (iNativeBuffer->copyToNvBuffer(cur_frame.m_dmabuf) != STATUS_OK)
{
    ORIGINATE_ERROR("Failed to copy frame to NvBuffer.");
}

Later, I pull these frames out of a queue to copy them into pre-allocated hardware memory so that Rivermax can transmit them. I have a bunch of memory sitting in data_ptr waiting to be copied to. I use NvBuffer2Raw to copy that NvBuffer’s data into the data_ptr.

///Pop a frame off the queue
g_transmit_queue.wait_dequeue_timed(cur_frame, std::chrono::seconds(2));
///payload is pre-allocated hardware memory
unsigned char *data_ptr = reinterpret_cast<unsigned char *>(payload);
///Use NvBuffer2Raw to copy the frame from LibArgus to Rivermax memory
///baseWidth and baseHeight are the video stream resolution - either 1920x1080 or 2328x1744 as explained below
int ret = NvBuffer2Raw(cur_frame.m_dmabuf, 0, baseWidth,baseHeight, data_ptr);

The issue I am having is that if I set my requested resolution (baseWidth x baseHeight) for the video stream to 1920x1080, all of this works perfectly fine. However, our camera has a native resolution of 2328x1744. When I set the camera streams to give me that resolution, everything appears to work normally, but while the NvBuffer has a width of 2328 and a height of 1744, its pitch is 9472 bytes. Each pixel is 4 bytes (ARGB pixel format), so each line is actually 2368 pixels long, 40 more than the requested width. If I attempt to use the NvBuffer2Raw command as above, I get a completely broken image with 40 extra pixels of black space at the end of each line. That looks like this (ignore the bad, blurry imagery, it's not the question)

If I instead change the NvBuffer2Raw command to the following, it creates a correct image with 40 black pixels at the end of each line:

int ret = NvBuffer2Raw(cur_frame.m_dmabuf, 0, cur_frame.m_params.pitch[0]/4, baseHeight, data_ptr);
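To make the arithmetic behind pitch[0]/4 explicit, here is a minimal sketch in plain C++. The helper names pixelsPerRow and rowPaddingBytes are hypothetical and are not part of any NVIDIA API; the sketch only assumes that the pitch is reported in bytes and that every pixel occupies the same number of bytes.

```cpp
#include <cassert>
#include <cstddef>

// Number of pixels each row actually occupies in memory, derived from the
// row pitch (in bytes) and the size of one pixel (in bytes).
inline std::size_t pixelsPerRow(std::size_t pitchBytes, std::size_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return pitchBytes / bytesPerPixel;
}

// Unused bytes at the end of each row: the pitch minus the bytes that
// carry image data.
inline std::size_t rowPaddingBytes(std::size_t pitchBytes,
                                   std::size_t widthPixels,
                                   std::size_t bytesPerPixel)
{
    assert(pitchBytes >= widthPixels * bytesPerPixel);
    return pitchBytes - widthPixels * bytesPerPixel;
}
```

With the numbers from this post, pixelsPerRow(9472, 4) is 2368 and rowPaddingBytes(9472, 2328, 4) is 160, that is, the 40 extra 4-byte pixels per line and the 160 bytes in the title.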

Because of the nature of my streaming system, I would like to not have these extra 40 black pixels at the edge of the image. It also would be nice to not have to allocate the buffers, understand the difference between expected width and actual width, and then send that data to the streaming part of the application. Overall, I would like to get the following answers:

1. Is it possible to copy the dmabuf of my NvBuffer correctly to data_ptr? NvBuffer2Raw cannot copy the image correctly without the extra 40 pixels of black space, and that is unacceptable for my application.
2. If not, is there a way to ensure that the NvBuffer only has exactly the data I need, and not the extra blank space?
3. Is there a better way to copy my LibArgus frames into the data_ptr? Should I be using an interface other than IImageNativeBuffer?

Please let me know if you would like further information, and thank you for your time.

For reference, I allocate the memory that data_ptr/payload are referring to here:


std::cout << "CUDA memory allocation on Host -  cudaMallocHost " << std::endl;
cudaError_t cuda_ret = cudaMallocHost((void**)&m_block[block_idx].data_ptr, size_in_bytes);
if (cuda_ret != cudaSuccess) {
   std::cout << "CUDA memory allocation on Host failed !!! Error: " << cuda_ret << std::endl;
   return false;
}

ShaneCCC · February 25, 2022, 7:38am

The NvBuffer is designed with 256 alignment, so software needs to handle that.
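One way software could handle this is to copy the image row by row and skip the padding at the end of each row. The sketch below is plain C++, not an NVIDIA API: alignUp and copyWithoutPadding are hypothetical helper names. It assumes the 256 alignment applies to the row pitch in bytes (which matches the 9472 pitch reported above) and that src is a CPU-readable pointer to the pitched plane; how that pointer is obtained from the NvBuffer is outside the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Round value up to the next multiple of alignment.
inline std::size_t alignUp(std::size_t value, std::size_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Copy `height` rows from a pitched source into a tightly packed
// destination. Only the first `rowBytes` bytes of each source row are
// copied, so the per-row padding is dropped.
inline void copyWithoutPadding(unsigned char *dst,
                               const unsigned char *src,
                               std::size_t srcPitch,
                               std::size_t rowBytes,
                               std::size_t height)
{
    assert(srcPitch >= rowBytes);
    for (std::size_t y = 0; y < height; ++y)
    {
        std::memcpy(dst + y * rowBytes, src + y * srcPitch, rowBytes);
    }
}
```

For the case in this thread, alignUp(2328 * 4, 256) gives 9472, while alignUp(1920 * 4, 256) gives 7680, which equals 1920 * 4, so a 1920-pixel-wide ARGB row needs no padding under this assumption. If both buffers are visible to CUDA, cudaMemcpy2D accepts separate source and destination pitches and may be able to do the same strided copy in a single call.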

zsherin · February 28, 2022, 6:26pm

Hi ShaneCCC,

Thanks for the reply, that makes sense. Do you have any ideas on my other question, namely

Is there a better way to copy my LibArgus frames into the data_ptr? Should I be using an interface other than IImageNativeBuffer?

Is there any better way to move data from LibArgus into a CUDA pointer than NvBuffer2Raw? If not, or if I should open another topic, I’m happy to close this and do so.

Thanks,
Zach

ShaneCCC · March 1, 2022, 4:19am

Maybe you can refer to the cudaHistogram/cudaBayerDemosaic samples in MMAPI.

zsherin · March 1, 2022, 2:29pm

Thank you! I’ll see what I can dig up there.

system · March 23, 2022, 6:23am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.