What is the correct way to use cudaMallocHost to create a local array representing the GPU data?

mm_j · August 20, 2024, 7:21am

If I am creating memory equally on the host and GPU like this:

T* cpuSide = NULL;
T* gpuSide = NULL;

size = 1024;

cpuSide = new T[size]; // bad method as per the link below

cudaStatus = cudaMalloc((void**)&gpuSide, size * sizeof(T));

This seems to work fine: I can run cudaMemcpy between the two buffers without any problems.

However, as per this link, this is the wrong way to create host memory: cuda - cudaMemcpy() calls to streams - Stack Overflow

They suggest we should instead do:

T* cpuSide = NULL;
T* gpuSide = NULL;

size = 1024;

cudaStatus = cudaMallocHost((void**)&cpuSide, size * sizeof(T));

cudaStatus = cudaMalloc((void**)&gpuSide, size * sizeof(T));

However, when I do this, I end up with memory errors on usage:
0xC0000005: Access violation reading location 0x00000002042003EC.

I presume this is not creating my array correctly on the local host. So what do I need to fix? Thanks for any help.

Robert_Crovella · August 20, 2024, 1:58pm

Calling the new-based allocation "wrong" is a bit extreme. It is perfectly valid. In some situations, using cudaMallocHost as an alternative may be necessary (e.g. to achieve copy/compute overlap) or preferred for other reasons.
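As a rough illustration of the copy/compute overlap case, the sketch below splits a pinned host buffer into chunks and issues each chunk's copy-in, kernel launch, and copy-out on its own stream. The kernel `scale`, the function `runChunked`, the chunk count, and the buffer size are all hypothetical choices made for illustration. With pageable memory from `new`, the `cudaMemcpyAsync` calls would still be legal, but the transfers would generally not overlap with kernel execution.

```cuda
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: doubles each element of one chunk.
__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Processes a pinned buffer chunk by chunk and returns its first element.
float runChunked() {
    const int kChunks = 4;            // assumed chunk count
    const int kChunkLen = 1 << 18;    // assumed elements per chunk
    const int kTotal = kChunks * kChunkLen;
    const std::size_t chunkBytes = kChunkLen * sizeof(float);

    float* host = nullptr;  // pinned (page-locked) host buffer
    float* dev = nullptr;   // device buffer of the same size
    CHECK(cudaMallocHost((void**)&host, kTotal * sizeof(float)));
    CHECK(cudaMalloc((void**)&dev, kTotal * sizeof(float)));
    for (int i = 0; i < kTotal; ++i) host[i] = 1.0f;

    cudaStream_t streams[kChunks];
    for (int c = 0; c < kChunks; ++c) CHECK(cudaStreamCreate(&streams[c]));

    // Work in different streams may overlap; work within one stream is ordered.
    for (int c = 0; c < kChunks; ++c) {
        float* h = host + c * kChunkLen;
        float* d = dev + c * kChunkLen;
        CHECK(cudaMemcpyAsync(d, h, chunkBytes, cudaMemcpyHostToDevice, streams[c]));
        scale<<<(kChunkLen + 255) / 256, 256, 0, streams[c]>>>(d, kChunkLen);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h, d, chunkBytes, cudaMemcpyDeviceToHost, streams[c]));
    }
    CHECK(cudaDeviceSynchronize());  // wait for every stream to finish

    const float first = host[0];
    for (int c = 0; c < kChunks; ++c) CHECK(cudaStreamDestroy(streams[c]));
    CHECK(cudaFree(dev));
    CHECK(cudaFreeHost(host));  // pinned memory is released with cudaFreeHost
    return first;
}

int main() {
    std::printf("host[0] = %.1f\n", runChunked());
    return 0;
}
```

Whether the copies and kernels actually overlap depends on the GPU and driver, so a profiler timeline is the way to confirm it on a given system.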

There is not enough information here to diagnose the problem. The two variants (with new and with cudaMallocHost) should be roughly equivalent from a “legal access” perspective, so something else is going on. If you are on Windows and requesting a large amount of space via cudaMallocHost (which you aren’t), that can be an issue. Also, be sure you are actually checking those cudaStatus results. If you still need help, provide a complete example, along with the CUDA version, the GPU, and the OS you are running on.
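For reference, a complete example of the kind requested might look like the sketch below. It assumes `float` in place of the template type `T`, and the helper `ok` and the function `roundTrip` are hypothetical names introduced here. Every runtime call's status is checked, and each buffer is released with the function that matches its allocator.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reports a failed call and returns false.
static bool ok(cudaError_t status, const char* what) {
    if (status != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// Copies `count` floats host -> device -> host and verifies the result.
bool roundTrip(std::size_t count) {
    float* cpuSide = nullptr;
    float* gpuSide = nullptr;
    const std::size_t bytes = count * sizeof(float);

    bool success =
        ok(cudaMallocHost((void**)&cpuSide, bytes), "cudaMallocHost") &&
        ok(cudaMalloc((void**)&gpuSide, bytes), "cudaMalloc");

    if (success) {
        for (std::size_t i = 0; i < count; ++i) cpuSide[i] = static_cast<float>(i);
        success = ok(cudaMemcpy(gpuSide, cpuSide, bytes, cudaMemcpyHostToDevice),
                     "host-to-device cudaMemcpy");
    }
    if (success) {
        // Clear the host buffer so the check below proves the copy back worked.
        for (std::size_t i = 0; i < count; ++i) cpuSide[i] = 0.0f;
        success = ok(cudaMemcpy(cpuSide, gpuSide, bytes, cudaMemcpyDeviceToHost),
                     "device-to-host cudaMemcpy");
    }
    if (success) {
        for (std::size_t i = 0; i < count; ++i) {
            if (cpuSide[i] != static_cast<float>(i)) success = false;
        }
    }

    // Match each allocator with its own release function.
    if (gpuSide != nullptr) cudaFree(gpuSide);
    if (cpuSide != nullptr) cudaFreeHost(cpuSide);  // not delete[] or free()
    return success;
}

int main() {
    const bool passed = roundTrip(1024);
    std::printf("%s\n", passed ? "round trip OK" : "round trip FAILED");
    return passed ? 0 : 1;
}
```

If a program shaped like this still faults, the printed status string together with the CUDA version, GPU, and OS gives others enough to reproduce the problem.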

mm_j · August 20, 2024, 10:54pm

Thanks. I just realized the cause: I had left a delete[] cpuAllocation call in the code and had not replaced it. My mistake. I will stick with cudaMallocHost, as it works and sounds safer going forward. Thanks.
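A leftover delete[] like this is easy to miss, because memory obtained from cudaMallocHost has to be released with cudaFreeHost instead. One way to make that pairing automatic, shown here as a minimal sketch, is to hold the pointer in a std::unique_ptr with a custom deleter. The names `PinnedDeleter`, `PinnedArray`, and `makePinned` are hypothetical and not part of the CUDA runtime.

```cuda
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <cuda_runtime.h>

// Deleter that releases pinned memory with cudaFreeHost, never delete[].
struct PinnedDeleter {
    void operator()(void* p) const noexcept { cudaFreeHost(p); }
};

template <typename T>
using PinnedArray = std::unique_ptr<T[], PinnedDeleter>;

// Allocates `count` elements of pinned host memory.
// No constructors or destructors run, so T is restricted to trivial types.
template <typename T>
PinnedArray<T> makePinned(std::size_t count) {
    static_assert(std::is_trivially_default_constructible<T>::value &&
                      std::is_trivially_destructible<T>::value,
                  "makePinned only supports trivial element types");
    T* raw = nullptr;
    const cudaError_t status = cudaMallocHost((void**)&raw, count * sizeof(T));
    if (status != cudaSuccess) {
        throw std::runtime_error(std::string("cudaMallocHost failed: ") +
                                 cudaGetErrorString(status));
    }
    return PinnedArray<T>(raw);
}

int main() {
    PinnedArray<float> cpuSide = makePinned<float>(1024);
    cpuSide[0] = 1.0f;  // used like an ordinary array
    // cpuSide.get() can be passed to cudaMemcpy as the host pointer.
    return 0;           // cudaFreeHost runs automatically when cpuSide goes out of scope
}
```

With this wrapper there is no manual release call left in the code to get wrong when the allocation method changes.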
