Issue accessing an NvBuffer frame

Hi Guys,

I intend to read frames from a live camera capture, copy them into CUDA host memory, and map that memory to the device so that it can be used on the GPU. However, when I copy the frame with cudaMemcpy and display the image using cv::imshow, I always get a NULL (black) strip at the bottom. When I display the image directly through the NvBuffer pointer, the whole frame is displayed.

Following is the code snippet for reading the frame into a character buffer using NvBuffer:

// Acquire a Frame.
        UniqueObj<Frame> frame(iFrameConsumer->acquireFrame());
        IFrame *iFrame = interface_cast<IFrame>(frame);
        if (!iFrame)
            break;

        // Get the Frame's Image.
        Image *image = iFrame->getImage();
        EGLStream::NV::IImageNativeBuffer *iImageNativeBuffer
              = interface_cast<EGLStream::NV::IImageNativeBuffer>(image);
        TEST_ERROR_RETURN(!iImageNativeBuffer, "Failed to create an IImageNativeBuffer");

        int fd = iImageNativeBuffer->createNvBuffer(Argus::Size {m_framesize.width, m_framesize.height},
               NvBufferColorFormat_YUV420, NvBufferLayout_Pitch, &status);
        if (status != STATUS_OK)
               TEST_ERROR_RETURN(status != STATUS_OK, "Failed to create a native buffer");

 #if 1

cudaSetDeviceFlags(cudaDeviceMapHost);

NvBufferParams params;
NvBufferGetParams(fd, &params);

char *data_mem = NULL;
int size = m_framesize.width* m_framesize.height;

data_mem = (char *)mmap(NULL, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[0]);

Displaying the image through the data_mem pointer with cv::imshow works. Following is the working code snippet:

cv::Mat CudaOUTimgbuf1 = cv::Mat(m_framesize.height, m_framesize.width, CV_8UC1, (void *) data_mem , params.pitch[0]);
	
cv::imshow("CudaOUTimgbuf1", CudaOUTimgbuf1);
cv::waitKey(1);

When I allocate CUDA host memory and copy the contents of the frame buffer into it using cudaMemcpy, I get a black NULL strip at the bottom of the window. Following is the code snippet:

char *h_myimagen = NULL;

char *d_myimagen = NULL;

int alloc1 = cudaHostAlloc((void **)&h_myimagen, ( m_framesize.height *  m_framesize.width)*sizeof(unsigned char), cudaHostAllocMapped);
	
int getPtr1 = cudaHostGetDevicePointer((void **)&d_myimagen, (void *) h_myimagen, 0);	

int copy1 = cudaMemcpy (d_myimagen,data_mem,m_framesize.width*m_framesize.height*sizeof(unsigned char),cudaMemcpyHostToDevice) ;	

cv::Mat CudaOUTimgbuf1 = cv::Mat(m_framesize.height, m_framesize.width, CV_8UC1, (void *) h_myimagen , params.pitch[0]);
	
cv::imshow("CudaOUTimgbuf1", CudaOUTimgbuf1);
cv::waitKey(1);

The same result is displayed when I copy the buffer into CUDA host memory using a nested loop instead of cudaMemcpy. My questions in this context are:

  1. My guess is that there is some trouble with the allocation of the memory or with how the allocated memory is accessed. Please correct me if I am wrong and guide me on how to proceed.

  2. When I change the frame size to 1920x1080, the program crashes with a segmentation fault. The program works when the frame size is 640x480 or 1080x720 (but gives NULL output at the bottom of the frame). Is there an issue with allocating a bigger CUDA memory region? Why does it crash? If not, is there an issue with displaying the frame using cv::imshow?

Thanks

Hi,

  1. It should work. Please check this sample:
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <cuda_runtime.h>
#include <helper_functions.h>
#include <helper_cuda.h>
#include <helper_timer.h>

int main ()
{
    cv::Mat img = cv::imread("cat.jpg");
    cv::Mat gray;
    cv::cvtColor(img, gray, CV_BGR2GRAY);
    int width = img.size().width;
    int height = img.size().height;

    char *h_myimagen = NULL;
    char *d_myimagen = NULL;

    int alloc1 = cudaHostAlloc((void **)&h_myimagen, ( height*width )*sizeof(unsigned char), cudaHostAllocMapped);
    int getPtr1 = cudaHostGetDevicePointer((void **)&d_myimagen, (void *) h_myimagen, 0);	

    cudaMemcpy( d_myimagen, gray.data, width*height*sizeof(char), cudaMemcpyHostToDevice) ;	
    cv::Mat CudaOUTimgbuf1 = cv::Mat(height, width, CV_8UC1, (void *)h_myimagen);

    cv::imshow("ALL GOODS", CudaOUTimgbuf1);
    cv::waitKey(0);
    return 0;
}

Build and run with:
nvcc topic_1018809.cpp -lopencv_core -lopencv_highgui -lopencv_imgproc -I/home/ubuntu/NVIDIA_CUDA-8.0_Samples/common/inc -o test && ./test

  2. There are lots of possible reasons. Please check the camera buffer settings first.
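
One buffer setting worth checking is the row pitch. A possible explanation for the black strip, assuming params.pitch[0] is larger than the frame width (pitch-linear buffers are often padded per row, but this is not confirmed in the thread): the copy moves width*height contiguous bytes, which covers fewer than height rows of a padded source, and the cv::Mat then walks a width*height allocation with a pitch-sized step, which at larger frame sizes could also read past the end of the allocation. The sketch below is host-only C++ with hypothetical helper names (copyPitchedToPacked, rowsCoveredByFlatCopy); it copies row by row so the destination is tightly packed. If the copy has to go through the CUDA runtime, cudaMemcpy2D takes the same destination pitch, source pitch, width, and height arguments.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of any NVIDIA or OpenCV API.
// Copies a pitched (row-padded) 8-bit plane into a tightly packed buffer.
// srcPitch is the byte distance between the starts of consecutive source
// rows and must be >= width.
std::vector<unsigned char> copyPitchedToPacked(const unsigned char *src,
                                               std::size_t srcPitch,
                                               std::size_t width,
                                               std::size_t height)
{
    std::vector<unsigned char> packed(width * height);
    for (std::size_t row = 0; row < height; ++row) {
        // Copy only the visible bytes of each row and skip the padding.
        std::memcpy(packed.data() + row * width, src + row * srcPitch, width);
    }
    return packed;
}

// Hypothetical helper: number of complete pitched rows that are covered
// when only width*height contiguous bytes are copied and the result is
// then displayed with a pitch-sized step.
std::size_t rowsCoveredByFlatCopy(std::size_t pitch,
                                  std::size_t width,
                                  std::size_t height)
{
    return (width * height) / pitch;
}
```

With a packed destination, the cv::Mat can be constructed without a step argument and the allocation needs only width*height bytes. If the pitched layout is kept instead, both the allocation and the copy would need pitch*height bytes. As an illustration with an assumed pitch of 768 for a 640x480 frame, rowsCoveredByFlatCopy(768, 640, 480) gives 400, so the last 80 rows would stay unfilled.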

Thanks.

Hi AastaLLL,

Thanks for the help. The code works fine. Could you let me know if there is any way to avoid the cudaMemcpy call? I want to read a frame from the camera and map it into GPU memory without copying the entire buffer.

Thanks.

Try reading the camera frame directly into d_myimagen.
We have lots of examples that demonstrate how to read an image from the camera; please check the Tegra Multimedia API.

I used cudaMemcpy() only because OpenCV stores the image in CPU memory.
Thanks.

Hi AastaLLL,

I have tried reading the camera frame into an NvBuffer, following the samples in the Tegra Multimedia API. Can you suggest any other way of reading a camera frame into memory that can also be accessed from the GPU without a memory copy?

Could you please point me to a sample that does this?

Thanks.

Hi,

If your framework is OpenCV, you can try GStreamer to get the camera data.
cv::VideoCapture() can read images directly from a GStreamer pipeline.

The default OpenCV4Tegra build doesn’t enable GStreamer support; please build OpenCV from source following this tutorial:
http://dev.t7.ai/jetson/opencv/

Thanks.