Zero copy: application killed for an unknown reason

Before using zero copy, I found that my application would be killed because I frequently malloc CPU memory or memcpy from CPU to GPU.
Now I use zero copy with the following code, without calling cudaHostGetDevicePointer(&ans, res, 0):

cudaHostAlloc((void **)&frameGpuData, frameLen, cudaHostAllocMapped);
cuxxxx<<<1, 128>>>(eglFrame.frame.pPitch[0], frameGpuData, image_width, image_height); // bgrbgrbgr

My application is then killed after running for a short while.
How can I find out what is causing my application to be killed?

Hi,

Please check our documentation first:
[url]https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#page-locked-host-memory[/url]

cudaHostAlloc() returns the host pointer; the device pointer should be retrieved with cudaHostGetDevicePointer().
Thanks.
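
For illustration, here is a minimal sketch of that pattern. The kernel name fillIndex and the buffer size are made up for this example and are not taken from the application in question. It allocates mapped pinned memory, asks the runtime for the matching device pointer, prints both pointers so they can be compared, and synchronizes before the CPU reads the result.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: stores the low byte of each index.
__global__ void fillIndex(unsigned char *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = static_cast<unsigned char>(i & 0xFF);
}

int main()
{
    const int n = 1024;  // assumed buffer size, in bytes

    // Request support for mapped pinned allocations, as in the test function in this thread.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Host pointer to page-locked memory that is mapped into the device address space.
    unsigned char *hostPtr = nullptr;
    cudaError_t err = cudaHostAlloc((void **)&hostPtr, n, cudaHostAllocMapped);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Device pointer that refers to the same allocation.
    unsigned char *devPtr = nullptr;
    err = cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostGetDevicePointer: %s\n", cudaGetErrorString(err));
        cudaFreeHost(hostPtr);
        return 1;
    }
    printf("host ptr: %p, device ptr: %p\n", (void *)hostPtr, (void *)devPtr);

    // Pass the device pointer, not the host pointer, to the kernel.
    fillIndex<<<(n + 127) / 128, 128>>>(devPtr, n);
    err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // wait before the CPU reads
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel: %s\n", cudaGetErrorString(err));
        cudaFreeHost(hostPtr);
        return 1;
    }

    // Verify on the CPU through the host pointer.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hostPtr[i] != static_cast<unsigned char>(i & 0xFF)) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFreeHost(hostPtr);
    return mismatches == 0 ? 0 : 1;
}
```

Checking every return code this way also helps narrow down a crash, because a failed allocation or launch is reported at the call that caused it.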

But in my test I can use the host pointer returned by cudaHostAlloc directly in GPU operations:

int zeroCopyTestGPU ()
{
	// Set flag to enable zero copy access
	cudaSetDeviceFlags(cudaDeviceMapHost);

	Mat bmp = imread ("./test.bmp");
	int frameLen = sizeof(unsigned char) * bmp.cols * bmp.rows * 3;
	/*-------gpudata-------*/
	unsigned char* BmpData;
	cudaMalloc((void**)&BmpData, frameLen);
	cudaMemcpy(BmpData,(unsigned char *)bmp.data,frameLen,cudaMemcpyHostToDevice);
	/*------------*/

	/*------zero copy---------*/
	ELAPSED_START(t1,t2,"cudaHostAlloc");
	unsigned char* frameAllocData;
	cudaHostAlloc((void **)&frameAllocData,  frameLen,  cudaHostAllocMapped);
	//printf("cpu zero ptr : %p\n", frameAllocData);
	ELAPSED_PRINT(t1,t2,"cudaHostAlloc");
	/*---------------*/

	//unsigned char* gpudata;
	//ELAPSED_START(t3,t4,"cudaHostGetDevicePointer");
	//cudaHostGetDevicePointer((void **)&gpudata,  (void *) frameAllocData , 0);
	//printf("cpu zero ptr : %p\n", gpudata);
	//ELAPSED_PRINT(t3,t4,"cudaHostGetDevicePointer");
	BGR2RGB(BmpData, frameAllocData,bmp.cols,bmp.rows);

	Mat bmp1(bmp.rows, bmp.cols, CV_8UC3, frameAllocData);
	cv::imwrite("kk.bmp", bmp1);
	//printf("kk\n");
	cv::cuda::GpuMat gpubmp1(bmp.rows, bmp.cols, CV_8UC3, frameAllocData);
	Mat bmp22;
	ELAPSED_START(t8,t9,"download");
	gpubmp1.download(bmp22);
	ELAPSED_PRINT(t8,t9,"download");
	cv::imwrite("kk1.bmp", bmp22);
	//printf("kk1\n");
	printf("============\n");
	return 0;
}

Hi,

Would you mind checking whether frameAllocData and the pointer returned by cudaHostGetDevicePointer() are identical?

Thanks.

Hi there, I have checked the pointers you mentioned and they are identical. What confuses me is why my program gets killed; I am certain that it is NOT because RAM is exhausted. Could you tell me whether there are any logs I can check?

Thanks.

I assume that you run the zeroCopyTestGPU function more than once?
You call both cudaMalloc() and cudaHostAlloc() without deallocating those buffers.
Presumably, you leak those buffers, and run out of memory, and that’s why your program dies.
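
A rough sketch of what a leak-free version of such a per-call allocation could look like is below. The function name processFrame, the frame size and the iteration count are placeholders I made up, not your code. It frees both buffers on every path and prints the free device memory reported by cudaMemGetInfo() before and after the loop, which is one way to see whether repeated calls consume memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for one call of a per-frame function.
static cudaError_t processFrame(size_t frameLen)
{
    unsigned char *devBuf = nullptr;     // ordinary device memory
    unsigned char *mappedBuf = nullptr;  // mapped pinned (zero-copy) memory

    cudaError_t err = cudaMalloc((void **)&devBuf, frameLen);
    if (err != cudaSuccess) return err;

    err = cudaHostAlloc((void **)&mappedBuf, frameLen, cudaHostAllocMapped);
    if (err != cudaSuccess) {
        cudaFree(devBuf);  // do not leak the first buffer when the second fails
        return err;
    }

    // ... kernel launches and synchronization would go here ...

    // Release both buffers before returning so repeated calls do not accumulate.
    cudaFreeHost(mappedBuf);
    cudaFree(devBuf);
    return cudaSuccess;
}

int main()
{
    const size_t frameLen = 1920 * 1080 * 3;  // assumed frame size in bytes
    size_t freeBefore = 0, freeAfter = 0, total = 0;

    cudaMemGetInfo(&freeBefore, &total);
    for (int i = 0; i < 100; ++i) {
        cudaError_t err = processFrame(frameLen);
        if (err != cudaSuccess) {
            fprintf(stderr, "iteration %d: %s\n", i, cudaGetErrorString(err));
            return 1;
        }
    }
    cudaMemGetInfo(&freeAfter, &total);

    printf("free before: %zu bytes, free after: %zu bytes\n", freeBefore, freeAfter);
    return 0;
}
```

If the buffers have the same size on every call, allocating them once at startup and reusing them avoids the repeated allocation cost entirely.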

If you run the program through the debugger (like gdb) what does it say when it dies?

Hi,

It is still recommended to use cudaHostGetDevicePointer() to obtain the device pointer.
The two pointers may happen to be identical, but this is not always guaranteed.

Another possible cause of the kill is concurrent access.
Have you added a synchronization call before accessing the buffer from another process?
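
As a sketch of the ordering I mean, assuming the concern is the CPU touching the mapped buffer while a kernel is still using it: the kernel name addOne and the sizes below are invented for this example. The CPU writes the input, the kernel is launched asynchronously on a stream, and the CPU reads the buffer only after cudaStreamSynchronize() returns.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: increments every byte in place.
__global__ void addOne(unsigned char *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] += 1;
}

int main()
{
    const int n = 4096;  // assumed buffer size, in bytes
    unsigned char *hostPtr = nullptr;
    unsigned char *devPtr = nullptr;
    cudaStream_t stream;

    if (cudaHostAlloc((void **)&hostPtr, n, cudaHostAllocMapped) != cudaSuccess) return 1;
    if (cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0) != cudaSuccess) return 1;
    if (cudaStreamCreate(&stream) != cudaSuccess) return 1;

    // The CPU fills the buffer before any kernel using it has been launched.
    memset(hostPtr, 41, n);

    // The launch returns immediately; the kernel may still be running afterwards.
    addOne<<<(n + 127) / 128, 128, 0, stream>>>(devPtr, n);

    // Block until all work queued on the stream has finished.
    // Only after this point should the CPU read or write the buffer again.
    cudaError_t err = cudaStreamSynchronize(stream);
    if (err != cudaSuccess) {
        fprintf(stderr, "stream sync: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hostPtr[i] != 42) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaStreamDestroy(stream);
    cudaFreeHost(hostPtr);
    return mismatches == 0 ? 0 : 1;
}
```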

Thanks.

Hi,
Thanks for the reminder. The zeroCopyTestGPU function is only my test function; it is not the program that gets killed.

Hi,
Regarding concurrent access: I have made sure that my program has no concurrent access. I have also tested my program for 24 hours and it was not killed during that time. Thanks for your reply. I will keep investigating.
Thanks