Using Argus NvVideoConverter and OpenCV V4L2_PIX_FMT_GREY

I’ve created an application based on the tegra_multimedia_api/samples/11_camera_object_identification example.

It works great: I get frames out of the image converter as fast as the camera can supply them. Just like the sample, the images coming from the ISP are converted by the video converter from YUV420M to ABGR32. This data can be copied neatly into an OpenCV Mat of type CV_8UC4.

What I really want is a grayscale image. I can use cvtColor with CV_BGR2GRAY, but I was hoping for a more direct route, so I tried converting the incoming video from YUV420M to GREY with the video converter and then creating a cv::Mat of type CV_8UC1. However, I don’t get the output I would expect.
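For reference, here is roughly what the working ABGR32 path looks like (just a sketch: ‘data’, ‘width’, and ‘height’ stand in for the values from the dequeued buffer plane, and I’m assuming the rows are packed with no padding, in the B, G, R, A byte order that V4L2_PIX_FMT_ABGR32 uses, which matches OpenCV’s BGRA):

#include <opencv2/opencv.hpp>

// Wrap the converter's ABGR32 output in a 4-channel Mat and collapse
// it to grayscale. Assumes the rows are packed (no pitch padding).
cv::Mat toGray(unsigned char *data, int width, int height)
{
    cv::Mat bgra(height, width, CV_8UC4, data); // no copy, just a view
    cv::Mat gray;
    cv::cvtColor(bgra, gray, cv::COLOR_BGRA2GRAY);
    return gray; // gray owns its data, safe after the buffer is requeued
}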

Here is an image that was captured using:
V4L2_PIX_FMT_YUV420M -> V4L2_PIX_FMT_ABGR32
and then converted with OpenCV using: cvtColor(incoming, outgoing, CV_BGR2GRAY)

Here is an image that was captured using:
V4L2_PIX_FMT_YUV420M -> V4L2_PIX_FMT_GREY

Any help would be appreciated.

Thanks,

Dave

Looking closely at the image, the first line appears completely correct, but then a tail of black is appended to the next line. In general, it’s as if extra black is added to the end of each row.

I’ve been looking around, and one of the things the video converter does is change the way the image is laid out in memory. In particular, there is a memory layout called ‘pitched’. avidday, in another forum post, says this about pitched memory:

‘Pitched linear memory is just a linear memory allocation calculated from the 2D sizes you provide, with padding added as required to ensure row major access will be correctly aligned for coalesced memory access.’

Here’s the forum post that I got this from: https://devtalk.nvidia.com/default/topic/462098/pitch-linear-memory/

Is it possible that this is an artifact of ‘pitched’ memory being mapped into an OpenCV Mat incorrectly?

It looks like the extra part of each row is 1/5th of the row width: 640 + 640 / 5 = 768. That makes some sense, because 768 is the next multiple of 256 above 640 (768 = 3 × 256), i.e. each row appears to be padded out to a 256-byte boundary.
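A quick sanity check of that arithmetic (just a sketch; the 256-byte alignment is my assumption based on the observed stride, not something I’ve found documented):

#include <cstdio>

int main()
{
    const unsigned width = 640; // bytes per row of useful 8-bit GREY data
    const unsigned align = 256; // assumed pitch alignment

    // Round the row width up to the next multiple of 'align'.
    unsigned pitch = (width + align - 1) / align * align;

    // Prints: width 640 -> pitch 768 (128 padding bytes per row)
    printf("width %u -> pitch %u (%u padding bytes per row)\n",
           width, pitch, pitch - width);
    return 0;
}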

We did confirm that the memory is laid out in a ‘pitched’ configuration. We used the ‘stride’ value from the plane, which was 768, to copy the data over row by row:

// Copy each row, skipping the padding bytes at the end of every stride.
for (int i = 0; i < height; i++) {
    memcpy(&out_data[i * width], &in_data[i * stride], width);
}

But this still produced a strange image; it wasn’t until we forced the stride to 1024 that we got a correct grayscale image.
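As an aside, if the reported stride is reliable you can skip the memcpy entirely, because cv::Mat has a constructor that takes a custom row step (a sketch, using the same in_data, width, height, and stride as above):

#include <opencv2/opencv.hpp>

// Wrap the pitched GREY buffer in place; the last argument is the
// step in bytes per row, which absorbs the padding.
cv::Mat gray(height, width, CV_8UC1, in_data, stride);

// OpenCV now only reads the 'width' valid bytes of each row. Clone
// if the Mat must outlive the driver's buffer:
cv::Mat owned = gray.clone();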

Hope this helps someone else who might get stuck here

We have confirmed cospan’s observation: for 640x480 input, the output buffer is 768x480. However, we don’t observe the issue where the stride has to be forced to 1024 when it is reported as 768.

To dump the output, we put the following code in ConsumerThread::converterCapturePlaneDqCallback():

if (dump) {
    printf("stride %u width %u height %u\n",
           buffer->planes[0].fmt.stride,
           buffer->planes[0].fmt.width,
           buffer->planes[0].fmt.height);
    // Write the raw plane, padding included, for offline inspection
    // (needs #include <fstream>).
    std::ofstream outputFile("dump.GREY", std::ios::binary);
    outputFile.write((char *)buffer->planes[0].data,
                     buffer->planes[0].bytesused);
    dump = false;
}

For similar cases, please make sure to check both the width and the stride.
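To inspect a dump like this offline, you can read the raw file back and strip the per-row padding (a minimal sketch; 768/640/480 are the stride, width, and height printed above):

#include <fstream>
#include <vector>
#include <opencv2/opencv.hpp>

int main()
{
    const int stride = 768, width = 640, height = 480;

    // Read the raw pitched dump written by the callback above.
    std::ifstream in("dump.GREY", std::ios::binary);
    std::vector<unsigned char> raw(stride * height);
    in.read(reinterpret_cast<char *>(raw.data()), raw.size());

    // Pass the stride as the Mat step so the padding is skipped,
    // then save a viewable image.
    cv::Mat gray(height, width, CV_8UC1, raw.data(), stride);
    cv::imwrite("dump.png", gray);
    return 0;
}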