cudaMallocManaged and cudaMemcpy

AkashNebhwani · May 27, 2016, 1:19pm

Hi All,

I have allocated some memory in unified memory with cudaMallocManaged, and now I need to copy some data into it.

A very basic question: should I use an ordinary C++ memcpy or cudaMemcpy? As I understand it, both should work, but what is the difference between them?

In particular, when I call cudaMemcpy and pass cudaMemcpyHostToDevice as the 4th argument, what does that actually do?

ROOT ISSUE: both kinds of copy work fine in my code with a single pthread.
With a plain memcpy, though, I get a bus error when two different kernels are launched from different threads,
whereas the same case works fine when I use cudaMemcpy.

I urgently need suggestions or a solution for this issue.
Any help is appreciated.

Thanks in advance.

Robert_Crovella · May 27, 2016, 1:40pm

The point of using cudaMallocManaged is that you don’t have to use cudaMemcpy.

You might want to read the relevant section of the programming guide:

[url]Programming Guide :: CUDA Toolkit Documentation[/url]

and perhaps study some of the UM sample codes.

With respect to host operations, you can use ordinary host operations (e.g. memcpy) to populate it if you wish.
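
As an illustration of populating a managed allocation with an ordinary host memcpy, here is a minimal sketch. The kernel add_one, the helper populate_and_run, and the sizes are hypothetical names chosen for the example, not anything from the original poster's code. It assumes a single host thread and the default stream.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void add_one(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or a wrong result.
int populate_and_run()
{
    const int n = 256;
    int host_src[n];
    for (int i = 0; i < n; ++i) host_src[i] = i;

    int *managed = nullptr;
    if (cudaMallocManaged(&managed, n * sizeof(int)) != cudaSuccess) return 1;

    // No kernel is in flight yet, so a plain host memcpy into the
    // managed region is fine; no cudaMemcpy is needed.
    std::memcpy(managed, host_src, n * sizeof(int));

    add_one<<<(n + 127) / 128, 128>>>(managed, n);

    // Wait for the kernel before the host touches the managed region again.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(managed);
        return 1;
    }

    // Read the results directly through the same pointer.
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (managed[i] != host_src[i] + 1) bad = 1;

    cudaFree(managed);
    return bad;
}

int main()
{
    int rc = populate_and_run();
    std::printf("%s\n", rc == 0 ? "ok" : "failed");
    return rc;
}
```

The same pointer is used on both the host and the device, which is why neither direction needs an explicit copy call.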

The bus error may arise if you haven’t satisfied the UM requirement that no host thread may touch a managed data region after a kernel call until you explicitly call cudaDeviceSynchronize(). If you use cudaMemcpy H->D, on the other hand, you can still write to such a region. This synchronization requirement is covered in the programming guide:

[url]http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-gpu-exclusive[/url]
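
The two ways of refilling a managed region between kernel launches can be sketched roughly as follows. The kernel scale, the helper refill_between_kernels, and the values are hypothetical and only for illustration. The sketch assumes a single host thread and the default stream, and it relies on cudaMemcpy being ordered after the pending kernel in that stream; it does not reproduce the multi-threaded case from the question.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: multiplies every element by factor.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or a wrong result.
int refill_between_kernels()
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    std::vector<float> ones(n, 1.0f), threes(n, 3.0f), fives(n, 5.0f);

    float *managed = nullptr;
    if (cudaMallocManaged(&managed, bytes) != cudaSuccess) return 1;

    std::memcpy(managed, ones.data(), bytes);   // nothing launched yet: safe
    scale<<<blocks, threads>>>(managed, n, 2.0f);

    // Option A: synchronize first, then touch the region from the host.
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(managed); return 1; }
    int bad = (managed[0] != 2.0f);
    std::memcpy(managed, threes.data(), bytes);
    scale<<<blocks, threads>>>(managed, n, 2.0f);

    // Option B: no explicit synchronize; write with cudaMemcpy H->D instead
    // of a host memcpy while the previous kernel may still be pending.
    if (cudaMemcpy(managed, fives.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(managed);
        return 1;
    }
    scale<<<blocks, threads>>>(managed, n, 2.0f);

    // Synchronize again before the host reads the final values (5 * 2 = 10).
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(managed); return 1; }
    for (int i = 0; i < n; ++i)
        if (managed[i] != 10.0f) bad = 1;

    cudaFree(managed);
    return bad;
}

int main()
{
    int rc = refill_between_kernels();
    std::printf("%s\n", rc == 0 ? "ok" : "failed");
    return rc;
}
```

Replacing the cudaMemcpy in Option B with a host memcpy, without a cudaDeviceSynchronize() in front of it, is the pattern the requirement above forbids.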

AkashNebhwani · May 29, 2016, 3:02pm

Thanks txbob! The link was a great help, and I think I see where I was going wrong.
