I have a kernel that runs a specific set of calculations for many points. The calculation on each thread (one point per thread) needs a large amount of temporary storage, roughly a double array of 5000 elements. That is far too large for shared memory. If I allocate that space in global memory instead, then given the number of points I have, it exceeds the capacity of my GPU. So, to avoid running out of memory, I declared these temporaries as local variables inside the kernel and device functions. Only the results are stored in global memory, which I allocate before the kernel launch.
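To illustrate, the kernel is structured roughly like this (the names and the calculation itself are simplified placeholders, not my actual code):

```cuda
// Sketch of the setup: one thread per point, a large per-thread
// scratch array, and only the final result written to global memory.
__global__ void computePoints(const double *points, double *results, int numPoints)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPoints) return;

    // Large per-thread temporary storage; too big for shared memory,
    // so the compiler places it in local memory (which physically
    // resides in device DRAM, per-thread).
    double scratch[5000];

    double acc = 0.0;
    for (int k = 0; k < 5000; ++k) {
        scratch[k] = points[i] * k;   // placeholder for the real calculations
        acc += scratch[k];
    }
    results[i] = acc;                 // result goes to pre-allocated global memory
}
```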
I monitor how much memory the GPU uses both before and after the kernel runs. Before execution, the memory utilization is in line with my global memory allocations. But after launching the kernel it jumps suddenly (by around 7 GB), and even after the kernel finishes, my code does not release that memory. I have checked for memory leaks and could not find any. I have also tested my code with memcheck and it reported no problems.
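For reference, a minimal sketch of how I do this check (here using `cudaMemGetInfo`; the kernel name and launch parameters are placeholders):

```cuda
// Query free/total device memory before and after the kernel run.
size_t freeBytes, totalBytes;

cudaMemGetInfo(&freeBytes, &totalBytes);
printf("before kernel: %zu MB free of %zu MB\n",
       freeBytes >> 20, totalBytes >> 20);

computePoints<<<numBlocks, blockSize>>>(dPoints, dResults, numPoints);
cudaDeviceSynchronize();              // make sure the kernel has finished

cudaMemGetInfo(&freeBytes, &totalBytes);
printf("after kernel:  %zu MB free of %zu MB\n",
       freeBytes >> 20, totalBytes >> 20);
```

The "after" reading stays about 7 GB lower than the "before" reading for the remainder of the program's lifetime.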
Interestingly, the size of this excess memory does not depend on the number of points; it is always the same constant amount.
Does anyone have information on this? Is there some mechanism in CUDA that needs and uses this memory?
Thanks in advance.