Can we force all allocations from cudaMalloc to fall within a specific virtual address range? Also, is there a way to pre-allocate a buffer and force any calls to cudaMalloc inside a kernel to allocate only within that buffer?
To my knowledge: no and no. That is hardly surprising; malloc-type allocators are in general not designed for this level of control.
It is not clear what you are trying to accomplish, but generally speaking, if an application needs very specific behavior from an allocator, the standard approach is to allocate one contiguous (in virtual address space) chunk of memory from the standard allocator (e.g. malloc or cudaMalloc) at application startup, then use an application-specific sub-allocator that implements whatever properties are desired for that chunk of memory.
From practical experience (I have done this multiple times for different types of apps), programming and testing a simple sub-allocator based on a free list takes a couple of hours. You could also look into creating a memory pool, a slab allocator, or a ring buffer, depending on what you are trying to accomplish.
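As a rough illustration of the idea, here is a minimal bump-style sub-allocator over a single cudaMalloc'd chunk. The names (SubAllocator, suballoc) are made up for this sketch; a real free-list version would also track and reuse freed blocks:

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>

// Minimal sketch of a host-side sub-allocator that hands out pieces of one
// big cudaMalloc'd chunk. A production version would add a free list,
// error handling, and thread safety.
struct SubAllocator {
    char  *base   = nullptr;   // start of the reserved chunk (device pointer)
    size_t size   = 0;         // total chunk size in bytes
    size_t offset = 0;         // bump pointer: next free byte

    bool init(size_t bytes) {
        size = bytes;
        return cudaMalloc(&base, bytes) == cudaSuccess;
    }

    // Bump allocation with 256-byte alignment (matches cudaMalloc's guarantee).
    void *suballoc(size_t bytes) {
        size_t aligned = (offset + 255) & ~size_t(255);
        if (aligned + bytes > size) return nullptr;   // out of space
        offset = aligned + bytes;
        return base + aligned;
    }

    void destroy() { cudaFree(base); base = nullptr; offset = size = 0; }
};

int main() {
    SubAllocator pool;
    if (!pool.init(64 << 20)) return 1;          // one contiguous 64 MiB chunk
    float *a = static_cast<float*>(pool.suballoc(1024 * sizeof(float)));
    float *b = static_cast<float*>(pool.suballoc(4096 * sizeof(float)));
    printf("a = %p, b = %p\n", (void*)a, (void*)b); // both fall inside the chunk
    pool.destroy();
    return 0;
}
```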
CUDA provides a mechanism to manage virtual locations of device memory allocations, but to my knowledge this has no bearing on in-kernel usage of malloc, new, or cudaMalloc.
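For reference, the mechanism referred to above is the CUDA virtual memory management API in the driver API (cuMemAddressReserve, cuMemCreate, cuMemMap, cuMemSetAccess). A rough sketch of reserving a virtual address range and mapping physical memory into it might look like the following; error checking is omitted, and the fixed-address value used here is only an illustrative request that the driver is free to ignore:

```cpp
#include <cuda.h>
#include <cstdio>

int main() {
    cuInit(0);
    CUdevice dev;
    CUcontext ctx;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    CUmemAllocationProp prop = {};
    prop.type          = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id   = dev;

    size_t gran = 0;
    cuMemGetAllocationGranularity(&gran, &prop, CU_MEM_ALLOC_GRANULARITY_MINIMUM);
    size_t size = 64 * gran;   // sizes must be multiples of the granularity

    // Reserve a virtual address range. The fixed start address (an arbitrary
    // illustrative value here) is a request, not a guarantee.
    CUdeviceptr va = 0;
    cuMemAddressReserve(&va, size, 0, (CUdeviceptr)0x700000000000ULL, 0);

    // Create physical memory and map it into the reserved range.
    CUmemGenericAllocationHandle handle;
    cuMemCreate(&handle, size, &prop, 0);
    cuMemMap(va, size, 0, handle, 0);

    CUmemAccessDesc access = {};
    access.location = prop.location;
    access.flags    = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(va, size, &access, 1);

    printf("mapped %zu bytes at %p\n", size, (void*)va);

    cuMemUnmap(va, size);
    cuMemRelease(handle);
    cuMemAddressFree(va, size);
    cuCtxDestroy(ctx);
    return 0;
}
```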
If you needed to do this inside the kernel, I don't know of any solution other than a roll-your-own allocator (i.e. what njuffa described).
Thank you.
That makes sense. I was thinking of using cuMemMap plus cuMemAddressReserve to create a big chunk and force allocations into a specific virtual address range. How about the code section/data section in the binary? Can we control where that is mapped? I want to force the code section to be at a specific virtual address.
I’m not aware of any method to do that.
In that case, is there a way to figure out where the code/stack is mapped to?
I’m not aware of any method to do that.
(You might be able to do something hacky, like getting a pointer to a device function in device code and then making some sort of guess based on that. However, from my perspective, this topic never comes up in the CUDA programming I am familiar with, so I don't know the intent here.)
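For completeness, the hacky approach alluded to above might look something like the sketch below. The printed value is an opaque code-space address, so any conclusion drawn from it is a guess:

```cpp
#include <cstdio>

__device__ void probe() {}   // arbitrary device function to take the address of

__global__ void print_code_pointer() {
    // Take a device-side function pointer and print its value. What this
    // address means relative to the module's code placement is unspecified,
    // so this is only a rough probe, not a reliable memory-map entry.
    void (*fp)() = probe;
    printf("device function pointer: %p\n", (void*)fp);
}

int main() {
    print_code_pointer<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```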
I just want to get a complete view of the memory allocations and where they are mapped. I was able to find all UVM ranges that were allocated by looking at the open-source GPU device driver, but I am unsure where the kernel code is loaded and whether it is exposed through UVM or not.
For what purpose? At this point this looks like an XY problem. You might get some relevant tips if you mention what it is you are actually trying to accomplish.
There is really no specific purpose (mostly my own curiosity). My end goal is to make a tool that visualizes GPU memory as a memory map, showing where things are allocated based on their virtual addresses.