After cudaMalloc, is the address virtual or physical?

tianyu9748 · October 13, 2022, 7:09pm

I am new to CUDA. Here are some questions I have from my project.
After cudaMalloc(&d_x, N*sizeof(float)), the allocation should be on the device, right?
printf("Address: %p\n", d_x) prints: 0x7fad1b200000.
My questions are:

Is this address virtual or physical? If it is virtual, how can I get the physical address from it?
If I don’t call cudaFree, will the data keep residing there, or will cudaFree be called automatically?
Is it possible to store the parameters of the model somewhere in global memory after training/inference ends?

Best
Max

Robert_Crovella · October 13, 2022, 8:23pm

Yes, the allocation is on the device.

The address is virtual.

You can’t get the physical address.

The data will keep residing there until the owning process terminates (at that point “cudaFree is called automatically”).

Sure, typical frameworks like TF and pytorch give the user the ability to locate a model on host or device, and move the model from host to device or vice versa. I suggest asking questions about how to do things in frameworks in other forums, like this one for pytorch.
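The answers above can be checked with a small sketch. It assumes a single visible GPU and uses cudaPointerGetAttributes to ask the runtime what kind of memory the pointer refers to. The CUDA_CHECK macro and the value of N are illustrative choices, not something from the original question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    const size_t N = 1 << 20;  // assumed element count
    float *d_x = nullptr;

    // The returned pointer is a virtual address in the process's address space.
    CUDA_CHECK(cudaMalloc((void **)&d_x, N * sizeof(float)));
    printf("Address: %p\n", (void *)d_x);

    // Ask the runtime where the allocation lives. There is no field or call
    // here that reports a physical address.
    cudaPointerAttributes attr;
    CUDA_CHECK(cudaPointerGetAttributes(&attr, d_x));
    printf("device memory: %s, device ordinal: %d\n",
           attr.type == cudaMemoryTypeDevice ? "yes" : "no", attr.device);

    // Without this call the allocation would stay in place until the process
    // exits; freeing explicitly is still the tidy option.
    CUDA_CHECK(cudaFree(d_x));
    return 0;
}
```

The printed address will differ from run to run and from system to system, so no particular value should be expected.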

tianyu9748 · October 13, 2022, 8:30pm

Thank you so much for the help! :)

tianyu9748 · October 13, 2022, 8:46pm

Hi Robert,
I have two more questions; I hope they won’t take you too long.

If I enable MIG on an A100 and create two 2-GPC instances, g1 and g2, do they (g1, g2, and the rest of the A100) also share the virtual memory space?
According to the MIG documentation, P2P is disabled in MIG mode. Does any communication between instances, such as g1 and g2, have to go through host memory? What about communication between g1 and the rest of the A100?

Best
Max

Robert_Crovella · October 13, 2022, 10:00pm

Every processor in the system shares the same unified virtual address space, when UVA is in effect. UVA is in effect any time you are using an A100 GPU, regardless of usage of MIG or not, or any other possible variation in how you may be using the A100.

Yes. (It is possible to do CUDA IPC between compute instances, but ordinary cudaMemcpyXXX operations between MIG instances would not follow a P2P path.)

When you enable MIG mode, you must define MIG instances for any usability of that A100. It is not possible to use MIG instances but somehow leave the “rest of A100” for “ordinary use”. The only thing that gives you any functionality at that point are defined MIG instances. Undefined space on the A100 is not usable, when MIG mode is enabled. Once you enable MIG mode, P2P is not supported for any sort of usage of any part of the A100.
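A minimal sketch of how one might check these two properties from a program: it queries the cudaDevAttrUnifiedAddressing attribute for each visible device and then asks cudaDeviceCanAccessPeer about each pair. The CUDA_CHECK macro is a hypothetical helper. Note the assumption that more than one device is visible to the process; in MIG mode a process may see only a single instance, in which case the peer loop prints nothing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    int count = 0;
    CUDA_CHECK(cudaGetDeviceCount(&count));
    printf("Visible CUDA devices: %d\n", count);

    // Report whether each device shares a unified address space with the host.
    for (int dev = 0; dev < count; ++dev) {
        int uva = 0;
        CUDA_CHECK(cudaDeviceGetAttribute(&uva, cudaDevAttrUnifiedAddressing, dev));
        printf("device %d: unified addressing = %d\n", dev, uva);
    }

    // Report whether a direct peer-to-peer path exists between each pair.
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int can_access = 0;
            CUDA_CHECK(cudaDeviceCanAccessPeer(&can_access, a, b));
            printf("device %d -> device %d: peer access = %d\n", a, b, can_access);
        }
    }
    return 0;
}
```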

tianyu9748 · October 13, 2022, 11:46pm

Appreciate it. It helps a lot lot lot.

jjjjjjjj · May 31, 2024, 4:37am

Hi Robert,
As you said, the memory address from cudaMalloc is a virtual address. When and where is that address translated from virtual to physical? And where is the Memory Management Unit (MMU) that performs the translation placed? If you have any related documents, please let me know.

Robert_Crovella · May 31, 2024, 4:50am

Sorry, I won’t be able to share with you the hardware design of the GPU chip. The GPU has an MMU. The address provided to the programmer from cudaMalloc is a virtual address. The CUDA programmer has no access to, nor any need of, the physical address, nor does the CUDA programmer need to know exactly where the MMU is placed on the chip or in the device address hierarchy. I don’t know of any related published documents.
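As an illustration of the point that only virtual addresses are visible, here is a hedged sketch using the CUDA driver API's virtual memory management calls, which are not mentioned in the reply above and are brought in here only as an example. It reserves a virtual address range, creates a physical allocation that is exposed only as an opaque handle, and maps one onto the other. The CU_CHECK macro is a hypothetical helper, and device ordinal 0 is an assumption.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda.h>

// Hypothetical helper macro: abort if a driver API call fails.
#define CU_CHECK(call)                                                \
    do {                                                              \
        CUresult res_ = (call);                                       \
        if (res_ != CUDA_SUCCESS) {                                   \
            fprintf(stderr, "%s failed with code %d\n", #call,        \
                    (int)res_);                                       \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    CU_CHECK(cuInit(0));
    CUdevice dev;
    CU_CHECK(cuDeviceGet(&dev, 0));  // assumes device ordinal 0
    CUcontext ctx;
    CU_CHECK(cuDevicePrimaryCtxRetain(&ctx, dev));
    CU_CHECK(cuCtxSetCurrent(ctx));

    // Describe a device-resident physical allocation.
    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = 0;  // same assumed device ordinal

    // Sizes must be a multiple of the allocation granularity.
    size_t size = 0;
    CU_CHECK(cuMemGetAllocationGranularity(&size, &prop,
                                           CU_MEM_ALLOC_GRANULARITY_MINIMUM));

    // Step 1: reserve a virtual address range (no memory behind it yet).
    CUdeviceptr va = 0;
    CU_CHECK(cuMemAddressReserve(&va, size, 0, 0, 0));

    // Step 2: create physical memory; only an opaque handle is returned.
    CUmemGenericAllocationHandle handle;
    CU_CHECK(cuMemCreate(&handle, size, &prop, 0));

    // Step 3: map the physical memory into the range and enable access.
    CU_CHECK(cuMemMap(va, size, 0, handle, 0));
    CUmemAccessDesc access = {};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    CU_CHECK(cuMemSetAccess(va, size, &access, 1));

    printf("Virtual address: %p (%zu bytes mapped)\n", (void *)va, size);

    // Tear down in reverse order.
    CU_CHECK(cuMemUnmap(va, size));
    CU_CHECK(cuMemRelease(handle));
    CU_CHECK(cuMemAddressFree(va, size));
    CU_CHECK(cuDevicePrimaryCtxRelease(dev));
    return 0;
}
```

Even at this level the program only ever handles a virtual address and a handle; the translation between them stays inside the driver and the GPU. The sketch needs to be linked against the driver library (for example with -lcuda).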

striker159 · May 31, 2024, 6:39am

Not directly an answer to the question, but a quick Google search finds this NVIDIA document regarding virtual memory and page tables for the Pascal architecture.
