What is the maximum CUDA stack frame size per kernel?

njuffa · November 18, 2013, 6:46pm

The compiler reports stack frame usage on a per-thread basis. The maximum stack frame size per thread for a given GPU is determined by (a) a hard architectural limit on the amount of local memory per thread and (b) the amount of available GPU memory.

The architectural limit on the amount of local memory per thread is documented in section G.1, table 12 of the CUDA C Programming Guide:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications

The available stack frame size per thread can then be approximated by:

stack frame size available per thread =
    min (amount of local memory per thread as documented in section G.1, table 12,
         available GPU memory / number of SMs / maximum resident threads per SM)
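
As a rough illustration, the approximation above can be written as a small host-side C++ helper. This is only a sketch: the function name and parameters are hypothetical, and the 512 KB constant is the per-thread local memory limit I recall the programming guide listing for compute capability 2.x and later, so check it against the table for your GPU. It ignores the allocation granularity effects, so treat the result as an upper-bound estimate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed architectural limit on local memory per thread (512 KB).
// Verify against the programming guide table for your compute capability.
constexpr std::size_t kLocalMemLimitPerThread = 512 * 1024;

// Hypothetical helper: approximate stack frame size available per thread,
// in bytes, following the min(...) formula above.
//   arch_limit_bytes   : architectural local memory limit per thread
//   free_gpu_bytes     : currently available GPU memory
//   sm_count           : number of SMs on the device
//   max_threads_per_sm : maximum resident threads per SM
std::size_t approx_stack_bytes_per_thread(std::size_t arch_limit_bytes,
                                          std::size_t free_gpu_bytes,
                                          int sm_count,
                                          int max_threads_per_sm) {
    assert(sm_count > 0 && max_threads_per_sm > 0);
    // Share of available memory if every resident thread on every SM
    // needs its own stack frame.
    const std::size_t per_thread_share =
        free_gpu_bytes / static_cast<std::size_t>(sm_count)
                       / static_cast<std::size_t>(max_threads_per_sm);
    return std::min(arch_limit_bytes, per_thread_share);
}
```

In a real program the inputs would presumably come from the CUDA runtime: `multiProcessorCount` and `maxThreadsPerMultiProcessor` from `cudaGetDeviceProperties`, and the free memory from `cudaMemGetInfo`. For a made-up device with 4 GiB free, 16 SMs, and 2048 resident threads per SM, the helper gives 128 KB per thread, so available memory rather than the architectural limit is the binding constraint in that case.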

This is only an approximation because there are various levels of allocation granularity that, as far as I know, are not documented and may vary from GPU to GPU. I do not know anything about your use case, but in general, massive local memory usage suggests to me that it may be worth rethinking how the work is mapped to the GPU.
