CUDA memory management in multi-GPU programming

cheer37 · December 4, 2014, 4:08am

I am developing a multi-GPU CUDA program and have some questions about memory management.

Is there unified memory that can be shared between CUDA-enabled GPUs?
As far as I know, the newly introduced unified memory addressing is only between GPU and CPU.
How can I make a local variable in a CUDA kernel, or a global variable, unified memory between CPU and GPU without using cudaMallocManaged?
When several GPUs exist, on which GPU do a kernel's local variables and global variables reside?
Also, how can I place them on a specific GPU?
Thanks in advance.

Robert_Crovella · December 4, 2014, 5:18am

Yes. If you read the programming guide section on Unified Memory, there is a whole section on behavior in a multi-GPU environment.
I assume by saying “not using cudaMallocManaged” you mean “I don’t want to use Unified Memory”. If you then want a variable to be “unified memory between cpu and gpu” (your question is a bit confusing) then you might also want to look at using cudaHostAlloc. If you’re simply asking how to create a variable in UM without using cudaMallocManaged, the other option is to declare it statically with __device__ __managed__.
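To illustrate the two options just mentioned, here is a minimal sketch, not a definitive implementation. The names managedCounter, bump, hostPtr and devPtr are made up for this example. It assumes a 64-bit platform and a GPU that supports managed memory and mapped pinned memory, and it skips error checking for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Option 1: a statically declared managed variable (hypothetical name).
// Host and device code can both refer to it directly.
__device__ __managed__ int managedCounter = 0;

// Kernel that touches both kinds of memory.
__global__ void bump(int *pinned)
{
    managedCounter += 1;  // managed memory, no explicit copy needed
    *pinned += 10;        // mapped pinned host memory, accessed from the GPU
}

int main()
{
    int *hostPtr = nullptr;  // host-side pointer to pinned memory
    int *devPtr  = nullptr;  // device-side alias of the same memory

    // Option 2: pinned host memory mapped into the device address space.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaHostAlloc((void **)&hostPtr, sizeof(int), cudaHostAllocMapped);
    *hostPtr = 1;
    cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0);

    bump<<<1, 1>>>(devPtr);
    cudaDeviceSynchronize();  // required before the host reads either value

    printf("managedCounter = %d, pinned value = %d\n", managedCounter, *hostPtr);

    cudaFreeHost(hostPtr);
    return 0;
}
```

In this sketch the managed variable is handled by the Unified Memory system, while the cudaHostAlloc buffer stays in host memory and is only mapped for device access, which is why it counts as not using Unified Memory.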
If you declare a statically allocated __device__ variable, then a separate variable will be instantiated in each GPU that has been initialized in the CUDA context. The one you actually access will be determined by:

in device code, the GPU on which the kernel is executing
in host code, the most recent cudaSetDevice() call prior to cudaMemcpyToSymbol() or cudaMemcpyFromSymbol()

Otherwise, if you dynamically allocate via cudaMalloc, then the variable will be instantiated on whichever GPU was identified in the most recent call to cudaSetDevice(). Local variables are local to the execution of whatever kernel they are in, so they are local to whatever device that kernel is executing on.
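The per-GPU copies of a static __device__ variable can be observed with a small test like the following. This is only a sketch under assumptions: perDeviceValue and readValue are hypothetical names, error checking is omitted, and the output depends on how many GPUs are visible.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One separate copy of this variable exists on every GPU (hypothetical name).
__device__ int perDeviceValue;

// Copies the current GPU's instance of the variable into an output buffer.
__global__ void readValue(int *out)
{
    *out = perDeviceValue;
}

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    // Write a different value into each GPU's copy. The copy that is
    // written is selected by the preceding cudaSetDevice() call.
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);
        int value = 100 + dev;
        cudaMemcpyToSymbol(perDeviceValue, &value, sizeof(int));
    }

    // Launch a kernel on each GPU; it sees only that GPU's copy.
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);
        int *dOut = nullptr;
        cudaMalloc((void **)&dOut, sizeof(int));  // allocated on GPU 'dev'
        readValue<<<1, 1>>>(dOut);
        int result = 0;
        cudaMemcpy(&result, dOut, sizeof(int), cudaMemcpyDeviceToHost);
        printf("GPU %d sees perDeviceValue = %d\n", dev, result);
        cudaFree(dOut);
    }
    return 0;
}
```

Each GPU should report the value that was written while it was the current device, which shows that the copies are independent.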

You’ll find answers to questions like these in the programming guide:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#abstract

cheer37 · December 4, 2014, 12:27pm

Thank you txbob.
For example, suppose xxx.cu contains the following:

__device__ int nGlobalVariable;
__device__ int nGlobalArray[3] = {1, 2, 3};
In this case, on which GPU do nGlobalVariable and nGlobalArray reside?
Also, how can I make them reside in a specific GPU's memory?

Robert_Crovella · December 4, 2014, 2:46pm

They are in every GPU that was present and visible when the CUDA context was initialized (separate copies in each GPU).

If you want them to be on a single GPU only, do not allocate them statically. Instead, use cudaSetDevice to select the GPU you want, then use cudaMalloc.
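A minimal sketch of that approach follows, assuming the target GPU is device 0. The names targetDevice, hostInit and dArray are illustrative only, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int targetDevice = 0;          // assumed index of the GPU to use
    const int hostInit[3] = {1, 2, 3};   // initial contents for the array
    int *dArray = nullptr;

    // Select the GPU first; the following cudaMalloc allocates on it.
    cudaSetDevice(targetDevice);
    cudaMalloc((void **)&dArray, sizeof(hostInit));
    cudaMemcpy(dArray, hostInit, sizeof(hostInit), cudaMemcpyHostToDevice);

    // Ask the runtime which device owns the allocation.
    cudaPointerAttributes attr;
    cudaPointerGetAttributes(&attr, dArray);
    printf("allocation lives on device %d\n", attr.device);

    cudaFree(dArray);
    return 0;
}
```

Unlike a static __device__ array, the pointer has to be passed to kernels as an argument, and the initial values have to be copied in explicitly.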

cheer37 · December 4, 2014, 5:42pm

Oh, thank you very much. That's what I wanted to know.
