Question about variables inside a kernel

koby · January 16, 2008, 11:53am

Hello all!

I have created the following kernel:

__global__ void test(…) {

 ...
 unsigned char myVar[512];
 ...

}

In what kind of memory is myVar allocated: registers, global, shared, or something else?

Thanks in advance!

AndreiB · January 16, 2008, 1:09pm

If every index into your array can be computed at compile time, the array will very likely be placed in registers. Otherwise it will be placed in local memory (i.e. the per-thread part of global memory).

In general the compiler seems to maximize register usage where possible (since registers are fast), but registers are not addressable, so a dynamically indexed array cannot live in them.

Also, arrays of a size similar to yours may be allocated in local memory because they would otherwise require too many registers.

Variables without the __shared__ specifier are never allocated in shared memory.

The best way to answer your question is to try it and check the resulting .cubin file for the kernel's actual resource usage (registers, shared and local memory).
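To make the distinction above concrete, here is a minimal sketch. The kernel names `staticIndex` and `dynamicIndex` are made up for this illustration. On current toolkits, compiling with `nvcc -Xptxas -v` prints per-kernel register and local-memory usage, which plays the same role as inspecting the .cubin file. What the compiler actually decides may differ between versions and architectures, so treat the comments as expectations rather than guarantees.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: after unrolling, every index is a compile-time
// constant, so the compiler is free to keep `v` entirely in registers.
__global__ void staticIndex(float *out)
{
    float v[4];
    #pragma unroll
    for (int i = 0; i < 4; ++i)
        v[i] = threadIdx.x * (i + 1.0f);
    out[threadIdx.x] = v[0] + v[1] + v[2] + v[3];
}

// Hypothetical kernel: the final index depends on run-time data, so the
// array typically ends up in local memory.
__global__ void dynamicIndex(const int *idx, unsigned char *out)
{
    unsigned char myVar[512];
    for (int i = 0; i < 512; ++i)
        myVar[i] = (unsigned char)(i + threadIdx.x);
    out[threadIdx.x] = myVar[idx[threadIdx.x] & 511];
}

int main()
{
    const int n = 32;
    float *fOut;
    int *idx;
    unsigned char *cOut;
    cudaMalloc(&fOut, n * sizeof(float));
    cudaMalloc(&idx, n * sizeof(int));
    cudaMalloc(&cOut, n);
    cudaMemset(idx, 0, n * sizeof(int));   // all threads read element 0

    staticIndex<<<1, n>>>(fOut);
    dynamicIndex<<<1, n>>>(idx, cOut);
    cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(fOut);
    cudaFree(idx);
    cudaFree(cOut);
    return 0;
}
```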

v01d · January 18, 2008, 5:40pm

I had the same question, so now I know. My related problem is that I need scratch space (an array of floats, for example) that is used only inside the kernel, with no transfers between host and device. I could pick a maximum size and declare it as a local variable, but that would limit the input size of the program. Allocating the memory dynamically outside the kernel would waste less space. On the other hand, I imagine local memory is optimized somehow: in the previous example it would not need 512 bytes * total threads, since not all threads execute simultaneously, only as many as fit in one warp.
So, is there any way to allocate memory dynamically outside the kernel call and specify that it will be used as local memory?

AndreiB · January 18, 2008, 6:33pm

Local memory is as slow as global memory, so you should avoid using it where possible. Check whether your kernel can benefit from using shared memory instead.
I’m not aware of any way to declare variable-sized arrays in a kernel. You can, however, dynamically allocate device memory from host code and then use it from the kernel.
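A minimal sketch of that last suggestion, assuming each thread needs `perThread` floats of scratch space whose size is only known at run time. The kernel name `useScratch` and the buffer layout are illustrative choices, not something from this thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread works in its own slice of a scratch
// buffer that the host allocated in global memory.
__global__ void useScratch(float *scratch, int perThread, float *out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float *mine = scratch + (size_t)tid * perThread;

    for (int i = 0; i < perThread; ++i)
        mine[i] = (float)i;

    float sum = 0.0f;
    for (int i = 0; i < perThread; ++i)
        sum += mine[i];
    out[tid] = sum;
}

int main()
{
    const int blocks = 4, threads = 64, total = blocks * threads;
    int perThread = 100;                    // chosen at run time

    float *scratch, *out;
    cudaMalloc(&scratch, (size_t)total * perThread * sizeof(float));
    cudaMalloc(&out, total * sizeof(float));

    useScratch<<<blocks, threads>>>(scratch, perThread, out);

    float first = 0.0f;
    cudaMemcpy(&first, out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("out[0] = %.1f\n", first);       // 0 + 1 + ... + 99 = 4950.0

    cudaFree(scratch);
    cudaFree(out);
    return 0;
}
```

The scratch buffer never travels to the host; only the one result value is copied back for checking. The contiguous per-thread slices used here keep the example simple, but an interleaved layout such as `scratch[i * total + tid]` usually gives better-coalesced global memory accesses.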

DenisR · January 18, 2008, 6:36pm

You can only allocate shared memory dynamically outside of your kernel. Also, local memory is not optimized the way you think, since your kernel does not run to completion one warp at a time. All warps are ‘in flight’ on the multiprocessor, so each thread needs its own local memory.
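For reference, a minimal sketch of dynamically sized shared memory: the array is declared `extern __shared__` inside the kernel and its size in bytes is passed as the third launch-configuration parameter. The kernel name `reverseBlock` and the single-block setup are assumptions made for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: reverses n floats using a shared-memory tile whose
// size is chosen by the host at launch time.
__global__ void reverseBlock(const float *in, float *out, int n)
{
    extern __shared__ float tile[];         // sized by the launch parameter
    int t = threadIdx.x;
    if (t < n) tile[t] = in[t];
    __syncthreads();
    if (t < n) out[t] = tile[n - 1 - t];
}

int main()
{
    const int n = 128;                      // could be a run-time value
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *dIn, *dOut;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dIn, h, n * sizeof(float), cudaMemcpyHostToDevice);

    // Third parameter: bytes of dynamic shared memory per block.
    reverseBlock<<<1, n, n * sizeof(float)>>>(dIn, dOut, n);

    cudaMemcpy(h, dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("h[0] = %.1f\n", h[0]);          // last input element: 127.0

    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}
```

Shared memory is per block rather than per thread, so this only replaces a per-thread array if the threads of a block partition the tile among themselves.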

v01d · January 22, 2008, 1:23pm

Ok, thanks for the information.
