NvBuffer Weirdness

I am attempting to capture an image, compute on individual pixels, and repeat rapidly. I started from the oneShot sample code verbatim to create a CameraProvider interface and enable an output stream. Pixel values are then accessed by writing each image to an NvBuffer and wrapping it in an OpenCV Mat for processing. The code that grabs the image, creates the NvBuffer, and maps it to an OpenCV object (with most comments and error checking removed) is below:

iCaptureSession->repeat(request.get());

	// initialize counter of frame captures
	int iCount = 0;

	while(iCount <= maxCount)
	{
		// issue capture request
		uint32_t requestId = iCaptureSession->capture(request.get());

		Argus::UniqueObj<EGLStream::Frame> frame(iFrameConsumer->acquireFrame(FIVE_SECONDS_IN_NANOSECONDS, &status));

		EGLStream::IFrame *iFrame = Argus::interface_cast<EGLStream::IFrame>(frame);

		// get the image from the frame
		EGLStream::Image *image = iFrame->getImage();

		/* write image data to NvBuffer */
		EGLStream::NV::IImageNativeBuffer *iImageNativeBuffer = interface_cast<EGLStream::NV::IImageNativeBuffer>(image);

		int dmabuf_fd = iImageNativeBuffer->createNvBuffer(Size {nImageWidth, nImageHeight}, NvBufferColorFormat_YUV420, NvBufferLayout_Pitch, &status);

		NvBufferParams params;

		int ret = NvBufferGetParams(dmabuf_fd, &params);

		/*
		*	convert image data in the NvBuffer to YUV image data
		*/

		for(int i = 0 ; i<params.num_planes ; i++)
		{
			int32_t width = params.width[i];
			int32_t height = params.height[i];
			int32_t pitch = params.pitch[i];

			size_t fsize = pitch*height;

			uint8_t* data_mem = (uint8_t*)mmap(0, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, params.offset[i]);

			ycbcr_split[i] = cv::Mat (height, width, CV_8UC1, data_mem, pitch);
		}

		// free memory for NvBuffer
		NvBufferDestroy(dmabuf_fd);

...

I use the first channel, ycbcr_split[0], to represent grayscale values. My test image is a black circle with ycbcr_split[0] values around 20 against a white background with ycbcr_split[0] values around 230.

Here is the weirdness. After repeated image captures, a row of the ycbcr_split[0] array will contain a block of sixty-four pixels with intensity of zero. It is always a block of 64 consecutive pixel values and it has nothing to do with the black circle. For small image sizes (64x48), the block of zeros appears in frame number 940 (+/- 22). For larger image sizes, they appear sooner: (640x480) at frame number 548+/-8 and (1280x960) at frame number 186+/-1 frame.

The entirety of the rest of the array (outside of the blocks of zeros) is good. The image data written to a JPEG image is good. If I check the array ycbcr_split[0] for a block of 64 zeros and repeat the conversion (on the exact same NvBuffer data), it will eventually return an array without the zeros. It may take one additional conversion, it might take a few hundred.

These blocks containing zeros are killing my ability to detect the black circle with a simple (and fast) algorithm. Any thoughts as to what is causing this issue are greatly appreciated.

Hi,
Does it also happen in the dumped YUVs (the mapped data_mem)? Does it happen in consecutive YUVs or in a single YUV? Please give us more information so that we can try to reproduce it.

I have not looked at data_mem. Perhaps it is straightforward, but I don’t immediately know where a particular value in ycbcr_split[0] resides in data_mem. I definitely should have tried to pinpoint the problem to either mmap or the cv::Mat conversion.

This behavior frequently happens in consecutive YUVs. I just ran the program now with an image size of 640x480. In the first 552 image captures, no blocks of zeros appeared in ycbcr_split[0]. Cycle number 553 contained (at least) one block of 64 zeros. After re-running the for-loop on the same NvBuffer data 51 times, ycbcr_split[0] no longer contained a block of zeros. Here are the number of ADDITIONAL times the for-loop containing mmap and cv::Mat had to be run to obtain no blocks of zero in cycle 554 and beyond:

76, 0, 38, 13, 4, 0, 1, 0, 0, 41, 6, 3, 15, 0, 87, 0, 0, 0, …

It can happen consecutively, but there is no obvious pattern. The only other information that might be helpful is that the blocks of zeros appear only at discrete, 64-aligned offsets within a row: 0-63, 64-127, 128-191, … In other words, I have never seen a case where a block of 64 zeros starts at, say, pixel 10 and runs to pixel 73 of a row of ycbcr_split[0].

If there is something more specific that would help you reproduce this issue, please let me know. I appreciate any help resolving this issue.

Hi,
It would be great if you could share a sample with us to reproduce the issue.
Or please advise which sample you ran when you added the mmap() code.

Gladly. Below is the bare bones of the code. I kept most comments but got rid of my error checking and diagnostic prints. It captures a frame, uses mmap and cv::Mat to place the pixel values into a Mat array, and then does some calculations.

For a test image, I placed a black circle drawn on white paper in front of the built-in camera. The calculation finds the minimum pixel intensity value in each row (and column). If the minimum intensity value in a row (or column) is below a threshold (of 75), that row (column) is assumed to contain the black dot. Once the rows and columns of the black dot are determined, its intensity-weighted center of mass is computed.

The output gives, in order, (1) the first column containing the object, (2) the x-position of the center of mass (units of pixels), and (3) the last column containing the object. Similarly for the rows.

On my TX1, it goes haywire around 550 captures.

#include <stdio.h>
#include <stdlib.h>
#include <Argus/Argus.h>
#include <EGLStream/EGLStream.h>

#include <EGLStream/NV/ImageNativeBuffer.h>
#include <nvbuf_utils.h>
#include <NvUtils.h>

#include <opencv2/opencv.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <sys/mman.h>


#define EXIT_IF_NULL(val,msg)   \
        {if (!val) {printf("%s\n",msg); return EXIT_FAILURE;}}
#define EXIT_IF_NOT_OK(val,msg) \
        {if (val!=Argus::STATUS_OK) {printf("%s\n",msg); return EXIT_FAILURE;}}

using namespace std;
using namespace Argus;

int main(int argc, char** argv)
{
	const uint64_t FIVE_SECONDS_IN_NANOSECONDS = 5000000000;

	std::vector<Argus::CameraDevice*> cameraDevices;

	vector<cv::Mat> ycbcr_split(3);

	/*
	 * Set up Argus API Framework, identify available camera devices, and create
	 * a capture session for the first available device
	 */

	Argus::UniqueObj<CameraProvider> cameraProvider(CameraProvider::create());

	Argus::ICameraProvider *iCameraProvider = Argus::interface_cast<Argus::ICameraProvider>(cameraProvider);
	EXIT_IF_NULL(iCameraProvider, "Cannot get core camera provider interface");

	Argus::Status status = iCameraProvider->getCameraDevices(&cameraDevices);
	EXIT_IF_NOT_OK(status, "Failed to get camera devices");
	EXIT_IF_NULL(cameraDevices.size(), "No camera devices available");

	Argus::UniqueObj<Argus::CaptureSession> captureSession(iCameraProvider->createCaptureSession(cameraDevices[0], &status));

	Argus::ICaptureSession *iCaptureSession = Argus::interface_cast<Argus::ICaptureSession>(captureSession);
	EXIT_IF_NULL(iCaptureSession, "Cannot get Capture Session Interface");

	/*
	 * Creates the stream between the Argus camera image capturing
	 * sub-system (producer) and the image acquisition code (consumer).  A consumer object is
	 * created from the stream to be used to request the image frame.  A successfully submitted
	 * capture request activates the stream's functionality to eventually make a frame available
	 * for acquisition.
	 */

	Argus::UniqueObj<Argus::OutputStreamSettings> outputStreamSettings(iCaptureSession->createOutputStreamSettings());

	Argus::IOutputStreamSettings *iOutputStreamSettings = Argus::interface_cast<Argus::IOutputStreamSettings>(outputStreamSettings);
	EXIT_IF_NULL(iOutputStreamSettings, "Cannot get OutputStreamSettings Interface");
	iOutputStreamSettings->setPixelFormat(Argus::PIXEL_FMT_YCbCr_420_888);
	iOutputStreamSettings->setResolution(Argus::Size(640, 480));

	/* parameters */
	uint nImageWidth = 640;
	uint nImageHeight = 480;
	const uint maxCount = 1000; // maximum count value (const so the arrays below have a constant size)

	/* set image resolution to stream */
	iOutputStreamSettings->setResolution(Argus::Size(nImageWidth, nImageHeight));

	Argus::UniqueObj<Argus::OutputStream> outputStream(
		iCaptureSession->createOutputStream(outputStreamSettings.get()));

	Argus::IStream *iOutputStream = Argus::interface_cast<Argus::IStream>(outputStream);
	EXIT_IF_NULL(iOutputStream, "Cannot get OutputStream Interface");

	Argus::UniqueObj<EGLStream::FrameConsumer> consumer(EGLStream::FrameConsumer::create(outputStream.get()));

	EGLStream::IFrameConsumer *iFrameConsumer = Argus::interface_cast<EGLStream::IFrameConsumer>(consumer);
	EXIT_IF_NULL(iFrameConsumer, "Failed to initialize Consumer");

	Argus::UniqueObj<Argus::Request> request(iCaptureSession->createRequest(Argus::CAPTURE_INTENT_STILL_CAPTURE));

	Argus::IRequest *iRequest = Argus::interface_cast<Argus::IRequest>(request);
	EXIT_IF_NULL(iRequest, "Failed to get capture request interface");

	status = iRequest->enableOutputStream(outputStream.get());
	EXIT_IF_NOT_OK(status, "Failed to enable stream in capture request");

	// indicate repeat capture
	iCaptureSession->repeat(request.get());

	// get the start time
	double startTime = clock();

	// initialize counter of frame captures
	int iCount = 0;

	// create an array to hold the object center positions
	double tCenter[maxCount] = {0.0};
	double xCenter[maxCount] = {0.0};
	double yCenter[maxCount] = {0.0};

	while(iCount <= maxCount)
	{
		// issue capture request
		uint32_t requestId = iCaptureSession->capture(request.get());
		EXIT_IF_NULL(requestId, "Failed to submit capture request");

		/*
		* Acquire a frame generated by the capture request, get the image from the frame
		* and create a .JPG file of the captured image
		*/

		Argus::UniqueObj<EGLStream::Frame> frame(iFrameConsumer->acquireFrame(FIVE_SECONDS_IN_NANOSECONDS, &status));

		EGLStream::IFrame *iFrame = Argus::interface_cast<EGLStream::IFrame>(frame);
		EXIT_IF_NULL(iFrame, "Failed to get IFrame interface");

		// get the image from the frame
		EGLStream::Image *image = iFrame->getImage();
		EXIT_IF_NULL(image, "Failed to get Image from iFrame->getImage()");

		/* write image data to NvBuffer */
		EGLStream::NV::IImageNativeBuffer *iImageNativeBuffer = interface_cast<EGLStream::NV::IImageNativeBuffer>(image);
		EXIT_IF_NULL(iImageNativeBuffer,"Failed to create an IImageNativeBuffer");

		int dmabuf_fd = iImageNativeBuffer->createNvBuffer(Size {nImageWidth, nImageHeight}, NvBufferColorFormat_YUV420, NvBufferLayout_Pitch, &status);
		EXIT_IF_NOT_OK(status, "Failed to create NvBuffer");

		NvBufferParams params;

		int ret = NvBufferGetParams(dmabuf_fd, &params);

		if(ret < 0)
		{
			printf("Failed to get native buffer parameters\n"); return EXIT_FAILURE;
		}


		/*
		*	convert image data in the NvBuffer to YUV image data
		*/

		for(int i = 0 ; i<params.num_planes ; i++)
		{
			int32_t width = params.width[i];
			int32_t height = params.height[i];
			int32_t pitch = params.pitch[i];

			size_t fsize = pitch*height;

			uint8_t* data_mem = (uint8_t*)mmap(0, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, params.offset[i]);

			ycbcr_split[i] = cv::Mat (height, width, CV_8UC1, data_mem, pitch);
		}

		// free memory for NvBuffer
		NvBufferDestroy(dmabuf_fd);


		/*
		* convert raw YUV data to grayscale pixel values
		*/

		// use the first channel of the YUV values as a substitute for grayscale values
		cv::Mat grayScaleValues = ycbcr_split[0].clone();


		// define vectors to hold column- and row-wise minimum pixel values
		cv::Mat minValuesByColumn;
		cv::Mat minValuesByRow;

		// define variables for the last column and row numbers for iterating
		int lastColumn = nImageWidth - 1;
		int lastRow = nImageHeight - 1;

		// determine minimum pixel values in each column
		cv::reduce(grayScaleValues,minValuesByColumn,0,CV_REDUCE_MIN,-1);

		// determine minimum pixel values in each row
		cv::reduce(grayScaleValues,minValuesByRow,1,CV_REDUCE_MIN,-1);

		// check for a zero row in the grayScaleValues array
		cv::Mat minValuesInGray;
		cv::reduce(minValuesByColumn,minValuesInGray,1,CV_REDUCE_MAX,-1);


		/*************************************************
		 *
		 * begin searching through pixels for object
		 *
		 *************************************************/

		// initialize values
		int iColStart = 0; // column number where the object starts
		int iColEnd = 0; // column number were the object ends

		int thresholdPixelValue = 75;

		/*
		*
		* determine the columns containing the object
		*
		*/

		for(int iC = 0 ; iC <= lastColumn ; iC++)
		{
			if(minValuesByColumn.at<uint8_t>(0,iC) <= thresholdPixelValue)
			{
				// a pixel from the object was detected
				iColStart = iC;
				break;
			}
		}


		// set the end column to the current column
		iColEnd = iColStart;

		// only continue if a pixel from the object was detected before the last column
		for(int iC = iColStart ; iC <= lastColumn ; iC++)
		{
			if(minValuesByColumn.at<uint8_t>(0,iC) <= thresholdPixelValue)
			{
				// still part of the object
				iColEnd = iC;
			}
			else
			{
				break;
			}
		}


		/*
		*
		* determine the rows containing the object
		*
		*/

		int iRowStart = 0; // row number where the object starts
		int iRowEnd = 0;

		for(int iR = 0 ; iR <= lastRow ; iR++)
		{
			if(minValuesByRow.at<uint8_t>(0,iR) <= thresholdPixelValue)
			{
				// a pixel from the object was detected
				iRowStart = iR;
				break;
			}
		}


		// set the end row to the current row
		iRowEnd = iRowStart;

		// only continue if a pixel from the object was detected before the last row
		for(int iR = iRowStart ; iR <= lastRow ; iR++)
		{
			if(minValuesByRow.at<uint8_t>(0,iR) <= thresholdPixelValue)
			{
				// still part of the object
				iRowEnd = iR;
			}
			else
			{
				break;
			}
		}


		/************************************************
		 *
		 * compute object center of mass
		 *
		 ************************************************/

		double pixelIntensity = 0.0;
		double columnPositionTimesIntensity = 0.0;
		double rowPositionTimesIntensity = 0.0;

		for(int iC = iColStart ; iC <= iColEnd ; iC++)
		{
			for(int iR = iRowStart ; iR <= iRowEnd ; iR++)
			{
				pixelIntensity += grayScaleValues.at<uint8_t>(iR,iC);
				columnPositionTimesIntensity += grayScaleValues.at<uint8_t>(iR,iC)*iC;
				rowPositionTimesIntensity += grayScaleValues.at<uint8_t>(iR,iC)*iR;
			}
		}

		// save current values
		tCenter[iCount] = (clock()-startTime)/1000000.0; // time in seconds
		xCenter[iCount] = columnPositionTimesIntensity/pixelIntensity; // x-position center of mass
		yCenter[iCount] = rowPositionTimesIntensity/pixelIntensity; // y-position center of mass

		printf("For image %i at %f s: Column center = [%i\t%f\t%i] and row center = [%i\t%f\t%i]\n",
			iCount, tCenter[iCount], iColStart, xCenter[iCount], iColEnd, iRowStart, yCenter[iCount], iRowEnd);


		/* Original oneShot code to write image data to a file */
		EGLStream::IImageJPEG *iImageJPEG = Argus::interface_cast<EGLStream::IImageJPEG>(image);
		EXIT_IF_NULL(iImageJPEG, "Failed to get ImageJPEG Interface");

		if(iCount == maxCount)
		{
			status = iImageJPEG->writeJPEG("oneShot.jpg");
			EXIT_IF_NOT_OK(status,"Failed to write JPEG");
		}

		if(iCount == 0)
		{
			// establish the autocontrol object
			//Argus::UniqueObj<Argus::Request> autoControlSettings(iRequest->getAutocontrolSettings());

			// get autocontrol interface
			Argus::IAutoControlSettings *iAutoControlSettings = Argus::interface_cast<Argus::IAutoControlSettings>(iRequest->getAutoControlSettings());

			// set autocontrol settings
			iAutoControlSettings->setAeLock(true);
			iAutoControlSettings->setAwbLock(true);
		}

		iCaptureSession->repeat(request.get());

		iCount++;
	}

	iCaptureSession->stopRepeat();

	printf("Captured %i frames in %f seconds at a rate of %f frames/sec\n", iCount-1, (clock()-startTime)/1000000.0,(iCount-1)/(clock()-startTime)*1000000.0);

	return EXIT_SUCCESS;
}

We will have new functions in next TX1 release:
https://devtalk.nvidia.com/default/topic/996666/jetson-tx1/msync-with-ms_sync-option-failure/post/5101903/#5101903

Thank you for the response. Do you have any indication that these new functions will address this problem? Otherwise, this is just more kicking the can down the road.

More importantly, is there an estimated release date?

Lastly, these functions will be part of what exactly, the next L4T? Jetpack? …?

Hi,
NvBufferMemSyncForCpu() should fix the issue. The estimated release date is in early July, if everything goes fine.

Hi DaneLLL,

I also want to apply OpenCV functions to camera images, so I want to convert the NvBuffer to an OpenCV Mat. I reproduced the above issue, but I don’t know how to apply the function below to fix it:

int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr)

What is the **pVirtAddr parameter for the code example in post #5? Should I replace mmap with NvBufferMemMap? A code example would be great! Thanks in advance!

Hi,
Here is a post about Argus + openCV
https://devtalk.nvidia.com/default/topic/1010111/jetson-tx1/nvmm-memory/post/5162049/#5162049

Please refer to it.

An example of NvBufferMemMap():

for(int i = 0 ; i<params.num_planes ; i++)
{
    void* data_mem;
    NvBufferMemMap(dmabuf_fd, i, NvBufferMem_Read_Write, &data_mem);
}

I installed the new JetPack and found the function definitions for NvBufferMemSyncForCpu() and others that should resolve the issue I found, but I’m still at a loss as to how to use them. The documentation continues to be largely unhelpful, and none of the sample code invokes these functions. (The latter point makes me think the mmap → NvBuffer → cv::Mat approach may not be the best way to access the individual pixel intensities of a captured image.)

Following post #10, I placed the line

int ret = NvBufferMemSyncForCpu(dmabuf_fd, i, &data_mem);

after line 13 in the snippet taken from post #5:

		/*
		*	convert image data in the NvBuffer to YUV image data
		*/

		for(int i = 0 ; i<params.num_planes ; i++)
		{
			int32_t width = params.width[i];
			int32_t height = params.height[i];
			int32_t pitch = params.pitch[i];

			size_t fsize = pitch*height;

			uint8_t* data_mem = (uint8_t*)mmap(0, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, params.offset[i]);

			ycbcr_split[i] = cv::Mat (height, width, CV_8UC1, data_mem, pitch);
		}

The compiler error was an invalid conversion from uint8_t** to void**.

I feel like I’m throwing darts in the dark since I still do not understand the underlying problem, and trying to solve it by analogy with other, peripherally related posts is typically unproductive.

Isn’t getting at the individual pixel intensities of an image fairly straightforward!?! Any guidance as to how to proceed would be fantastic.

Hi numbersnerd,
Have you tried https://devtalk.nvidia.com/default/topic/1010111/jetson-tx1/nvmm-memory/post/5162049/#5162049 ? It demonstrates how to get cv::Mat buffers.

Hello DaneLLL, and thank you for your response.

Unfortunately, your response seems to contradict previous posts, or I am just misunderstanding. The original issue was raised after the included code (post #5) was used to get pixel intensity values into a cv::Mat object. That process worked. The problem was in the pixel intensity data containing rows with phantom groups of zeros in blocks of 64.

The solution (post #8) was to sit tight (for months, mind you) and wait for the JetPack release. The post you reference above (#12) seems to have nothing to do with the NvBufferMem…() functions; it suggests a different approach that requires working out the back-and-forth in that thread.

Please clarify how you are suggesting I solve the problem described herein: (1) figuring out NvBufferMemSyncForCpu(), or (2) adapting the approach of post #12?

If it is the first, please see post #11 as I believe the documentation for NvBufferMem…() functions is too cryptic to be useful. If it is the second, I could use clarification as to what in those codes should solve this problem.

Please call NvBufferMemMap() to replace mmap(), and then call NvBufferMemSyncForCpu() before reading the mapped data on the CPU.
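A sketch of that change, applied to the per-plane loop from post #5 (untested here; note the void* temporary, which avoids the uint8_t** to void** conversion error from post #11):

```cpp
for (uint32_t i = 0; i < params.num_planes; i++)
{
    void *data_mem = NULL; // void* matches the NvBufferMem APIs' void** parameter

    // Map the plane for CPU access (replaces mmap()).
    NvBufferMemMap(dmabuf_fd, i, NvBufferMem_Read_Write, &data_mem);

    // Sync the CPU cache with what the hardware wrote before reading.
    NvBufferMemSyncForCpu(dmabuf_fd, i, &data_mem);

    ycbcr_split[i] = cv::Mat(params.height[i], params.width[i], CV_8UC1,
                             data_mem, params.pitch[i]);

    // ... use/clone the Mat, then unmap:
    // NvBufferMemUnMap(dmabuf_fd, i, &data_mem);
}
```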