Using Constant Memory to Hold Parameters

Hello! I have a CUDA program where I call a few kernels many thousands of times. These kernels take struct arguments that contain a few dozen data members. I understand that every time a kernel is launched its arguments must be copied to the GPU (I think into constant memory). Since most of these arguments do not change between launches, I thought I could store them directly in constant memory myself and avoid that small per-launch overhead.
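
To give a rough idea, the current pattern is something like the sketch below (the struct, member names, and kernel are simplified placeholders, not my real code):

// Simplified stand-in for my real parameter struct: in practice it has a
// few dozen scalars/flags plus some array pointers.
struct Params {
    int   n;      // problem size (placeholder)
    float scale;  // some constant coefficient (placeholder)
    int*  a;      // array allocated on the GPU
};

__global__ void my_kernel(const Params p, float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p.n)
        out[i] = p.scale * p.a[i];  // 'p' is passed by value at every launch
}

// Launched many thousands of times with a mostly unchanged struct:
// my_kernel<<<blocks, threads>>>(data_cpu, out_dev);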

The first question is: is there any real performance gain to be had from this switch to constant memory?

The second question is: how? It looks like one cannot simply put a struct in constant memory (CUDA C Programming Guide v4.2, section D.2.1.1). I therefore decided to painfully put each data member into constant memory individually (see the sketch after the list below for the scalar members), but somewhere my logic or implementation goes wrong:
1- Some of the members are arrays.
2- I do not want to store the arrays themselves in constant memory.
3- I only need a pointer to each array in constant memory.
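
For the plain scalar and flag members the idea seems straightforward; a minimal sketch of what I mean, reusing the placeholder Params struct from above (names made up):

// One __constant__ variable per scalar member (placeholder names).
__constant__ int   c_n;
__constant__ float c_scale;

void upload_scalars(const Params& data_cpu)
{
    // Copy the rarely changing scalars into constant memory once,
    // instead of passing them as kernel arguments at every launch.
    cudaMemcpyToSymbol(c_n,     &data_cpu.n,     sizeof(data_cpu.n));
    cudaMemcpyToSymbol(c_scale, &data_cpu.scale, sizeof(data_cpu.scale));
}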

I am trying the following:
I create the struct on the CPU; its array members are allocated on the GPU, and the remaining members are constants or flags. I then try to cudaMemcpyToSymbol the GPU array's pointer into a __constant__ pointer-to-pointer, as in the sample below:

__constant__ int** my_struct_a;
...
    cudaError_t error1_a = cudaMemcpyToSymbol(my_struct_a, &(data_cpu.a), sizeof(int**));

where ‘data_cpu’ is the struct and the data member ‘a’ is an int* allocated on the GPU.
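
For context, ‘a’ is set up roughly like this beforehand (simplified; ‘n’ and ‘host_a’ stand in for my real sizes and host data):

int* dev_a = 0;
cudaMalloc((void**)&dev_a, n * sizeof(int));                         // array lives in global memory
cudaMemcpy(dev_a, host_a, n * sizeof(int), cudaMemcpyHostToDevice);
data_cpu.a = dev_a;   // struct stays on the CPU, this member points into GPU memory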

Although ‘error1_a’ reports success, the kernel fails when it tries to access the actual data. Debugging the kernel shows ‘my_struct_a’ as ‘???’, or simply “failed to read memory”.

How can I fix this? If you need any details to clarify things, I will be happy to provide them.

Any suggestions, please?

There was some discussion of constant memory on this list before.
It turns out that constant memory can have high overheads, which can make
it less attractive than it first appears. For example, if the threads of a
warp read different words they cannot do so in parallel; they must wait for
one another. Also, constant memory is not really on-chip memory at all, but
a small read-only cache in front of a 64KB region of global memory.
See section 8.4 of “Creating and Debugging Performance CUDA C”, W. B. Langdon.
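
To make the "different words" point concrete, a rough sketch (the table name and size are made up): when every thread of a warp reads the same constant word the value is broadcast in a single transaction, but per-thread indices are serialised within the warp.

__constant__ float coeff[256];   // made-up lookup table in constant memory

__global__ void broadcast_read(float* out, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Every thread in the warp reads the same word coeff[k]:
    // the constant cache broadcasts it, so this is cheap.
    out[i] = coeff[k] * i;
}

__global__ void divergent_read(float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Each thread reads a different word of coeff: the accesses are
    // serialised within the warp, and on a cache miss they fall back
    // to the 64KB backing store in global memory.
    out[i] = coeff[i % 256];
}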