How does CUDA unified memory handle data movement?
Let’s say I have a program that looks like this:

cudaMallocManaged(&y, ...);          // y is accessible from both the CPU and the GPU
host_kernel(y);                      // the CPU reads/writes y
device_kernel<<<grid, block>>>(y);   // then the GPU reads/writes the same y
cudaaDeviceSynchronize();
host_kernel(y);
device_kernel<<<grid, block>>>(y);
cudaDeviceSynchronize();
host_kernel(y);
device_kernel<<<grid, block>>>(y);
Will I take a performance hit due to data movement between device and host kernels?
If this depends on the GPU, at what generation does it start becoming efficient? Would Compute Capability 6.0+ (Pascal) suffice? https://developer.nvidia.com/blog/unified-memory-cuda-beginners/#what_happens_on_pascal_when_i_call_cudamallocmanaged
Do I need to explicitly prefetch the data to reduce the data-movement overhead?
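To make the prefetch question concrete, this is roughly the pattern I have in mind from the blog post above (just a sketch assuming Pascal or newer, a single device, and the default stream; N, grid, and block are placeholders):

int device = 0;
cudaGetDevice(&device);

cudaMallocManaged(&y, N * sizeof(float));
host_kernel(y);                                               // CPU initializes y

cudaMemPrefetchAsync(y, N * sizeof(float), device);           // migrate y to the GPU before the launch
device_kernel<<<grid, block>>>(y);

cudaMemPrefetchAsync(y, N * sizeof(float), cudaCpuDeviceId);  // migrate it back before the CPU touches it
cudaDeviceSynchronize();
host_kernel(y);

My understanding is that the prefetches migrate the pages in bulk instead of triggering on-demand page faults, but I’m not sure whether both directions are actually necessary.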
For better performance, should I come up with an algorithm that allocates only as much memory as fits on the GPU at once and then processes the data in batches? (Related: “CUDA - Unified memory (Pascal at least)” on Stack Overflow.)
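For the batching idea, here is a self-contained sketch of what I mean. Instead of reallocating per batch, it prefetches one chunk at a time so only that chunk has to be resident on the GPU (the sizes, kernel body, and launch configuration are all made up):

#include <cuda_runtime.h>

__global__ void device_kernel(float* y, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) y[i] *= 2.0f;                                      // placeholder work
}

int main() {
    const size_t N = 1 << 28;                                     // total elements (assumed)
    const size_t CHUNK = 1 << 24;                                 // per-batch elements, sized to fit in GPU memory (assumed)
    int device = 0;
    cudaGetDevice(&device);

    float* y = nullptr;
    cudaMallocManaged(&y, N * sizeof(float));                     // error checking omitted for brevity

    for (size_t off = 0; off < N; off += CHUNK) {
        size_t n = (N - off < CHUNK) ? (N - off) : CHUNK;
        cudaMemPrefetchAsync(y + off, n * sizeof(float), device); // stage this batch on the GPU
        device_kernel<<<(unsigned)((n + 255) / 256), 256>>>(y + off, n);
        cudaMemPrefetchAsync(y + off, n * sizeof(float), cudaCpuDeviceId); // return it to the CPU
    }
    cudaDeviceSynchronize();

    cudaFree(y);
    return 0;
}

Everything is on the default stream here, so each chunk’s prefetch, kernel, and write-back serialize; with one stream per chunk the transfers could overlap the previous chunk’s compute, but that seemed beyond the scope of the question.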