Constant memory usage in CUDA code

markhor · January 29, 2012, 8:29pm

I can not figure it out myself, what is the best way to ensure the memory used in my kernel is constant. There is a similar question at c++ - How to use CUDA constant memory in a programmer pleasant way? - Stack Overflow.

I am working with GTX580 and compiling only for 2.0 capability. My kernel looks like

global Foo(const int *src, float *result) {…}

I execute the following code on host:

cudaMalloc(src, size);

cudaMemcpy(src, hostSrc, size, cudaMemcpyHostToDevice);

Foo<<<…>>>(src, result);

the alternative way is to add

constant src;

to .cu file, remove src pointer from the kernel and execute

cudaMemcpyToSymbol(“src”, hostSrc, size, 0, cudaMemcpyHostToDevice);

Foo<<<…>>>(result);

Are these two ways equivalent or the first one does not guarantee the usage of constant memory instead of global memory? size changes dynamically so the second way is not handy in my case.

cmaster.matso · January 30, 2012, 9:27am

I can not figure it out myself, what is the best way to ensure the memory used in my kernel is constant. There is a similar question at c++ - How to use CUDA constant memory in a programmer pleasant way? - Stack Overflow.

I am working with GTX580 and compiling only for 2.0 capability. My kernel looks like

global Foo(const int *src, float *result) {…}

I execute the following code on host:

cudaMalloc(src, size);

cudaMemcpy(src, hostSrc, size, cudaMemcpyHostToDevice);

Foo<<<…>>>(src, result);

the alternative way is to add

constant src;

to .cu file, remove src pointer from the kernel and execute

cudaMemcpyToSymbol(“src”, hostSrc, size, 0, cudaMemcpyHostToDevice);

Foo<<<…>>>(result);

Are these two ways equivalent or the first one does not guarantee the usage of constant memory instead of global memory? size changes dynamically so the second way is not handy in my case.

As far as I know using ‘const’ as a kernel parameter modifier means the same as in host C/C++ - You cannot change such a variable, it is read-only. The ‘constant’ memory array/pointer/variable means that when accessing it one read will be done for half-warp of threads. See CUDA documentation for more precise information, please.

markhor · January 30, 2012, 10:54am

Well, on StackOverflow it is written that

That’s why I am so interested.

laughingrice · February 5, 2012, 1:00pm

specifying the pointer as const is a protective measure. It declares that the data is input only and not input / output and should not be changed in the function. The compiler checks that it is not written to and is allowed to make optimizations based on that assumption. It does not alter the location of the memory.

Using the const keyword says that the data is accessed via the constant cache, which is something completely different. Although theoretically possible (as OpenCL works this way), I doubt that the compiler will automatically read the input via the constant cache as it’s highly prone to performance problems.

The constant cache is limited in size (not just the cache, but the entire buffer size is limited to 64KB per kernel)
The constant cache has different required access patterns than global memory. It only works well if all threads in the warp access exactly the same memory address (a scatter operation) otherwise there is access serialization.

constant cache is very good for storing things such as convolution kernels, but is not very good for general purpose access.

Topic		Replies	Views
Why does it need constant memory CUDA Programming and Performance	4	807	October 10, 2013
Am I using the constant memory properly? CUDA Programming and Performance	4	1765	May 15, 2013
__constant__ memory in function scope CUDA Programming and Performance	13	4533	June 1, 2015
Cuda constant memory CUDA Programming and Performance cuda , kernel	5	1720	September 7, 2023
Can I use __constant__ memory with pointer to it as kernel arg CUDA Programming and Performance	4	4427	April 2, 2011
Global arrays? CUDA Programming and Performance	24	10612	August 18, 2010
constant cache no faster than global mem? constant memory access CUDA Programming and Performance	5	5697	March 6, 2008
Small const array accessable globally? Is it easy and possible? CUDA Programming and Performance	6	1405	April 16, 2009
why constant memory doesn't work in my example application CUDA Programming and Performance	3	8387	January 10, 2011
__constant__ use CUDA Programming and Performance	6	15140	June 14, 2008

Constant memory usage in CUDA code

Related topics