I’d like to use constant memory in my CUDA program.
Since the kernel needs to access several fields of a parameter structure, I thought I would put that structure in constant memory. The structure is quite simple: it has only float and int fields. My first doubt is that in the code samples I have read, constant memory is usually declared as an array rather than a scalar, so I declared an array of one element to avoid any problem. My second doubt is that the kernel and main are in different files. So I declared the constant memory in main.cu (where I copy the structure into constant memory), and in the file containing the kernel I declared the variable as external.
The main works as follows:
- I fill the structure in host memory
- I copy the structure from host memory to constant memory with the function cudaMemcpyToSymbol
- I initialize the device memory that I need for the kernel
- I launch the kernel
The main looks like:
[codebox]
typedef struct {
    float wr;
    float wh;
    float V0;
    float Va;
    float w0;
    int type;
} parameters;

// constant memory symbol, declared as a one-element array
__constant__ parameters device_params[1];

int main(){
    parameters params;
    // …
    // initialize params
    // …
    params.type=8;
    // copy the host structure into constant memory
    cudaMemcpyToSymbol(device_params,&params,sizeof(parameters));
    // …
    kernel<<<Nblocks,Nthreads>>>(data);
    // …
}
[/codebox]
In the kernel file I declared the constant memory as external:
[codebox]
extern __constant__ parameters device_params[1];

__global__ void kernel(float *data){
    //…
    // host-style fprintf only works here in device emulation mode
    fprintf(stderr,"device_params[0].type=%d\n",device_params[0].type);
}
[/codebox]
Now, if I run the program in emulation mode and print the value of the type field of device_params[0] inside the kernel, the result is 0 instead of 8.
Since cudaMemcpyToSymbol (and the other CUDA functions) return cudaSuccess, I don’t understand why the value is wrong inside the kernel.
Maybe nvcc doesn’t recognize the “extern” keyword and defines another constant memory symbol with the same name. Or maybe emulation mode is at fault, but when I compile without the emulation flag the result of the program is wrong as well…
I use CUDA toolkit 2.3 under Ubuntu 9.10 64-bit.
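For comparison, here is a minimal single-file sketch of the same steps (fill the structure, copy it with cudaMemcpyToSymbol, launch a kernel, read the value back). It assumes the constant symbol and the kernel live in the same .cu file, which is not how my real program is organized. The kernel name read_type and the buffers d_out and h_out are placeholders I made up for this test, and it returns the value through global memory instead of printing from the kernel, so it should not depend on emulation mode. It is only a sketch for narrowing down the problem, not a confirmed fix.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

typedef struct {
    float wr;
    float wh;
    float V0;
    float Va;
    float w0;
    int type;
} parameters;

// Constant symbol defined in the SAME file as the kernel that reads it
// (assumption of this sketch; the real program splits them across files).
__constant__ parameters device_params[1];

// Placeholder kernel: writes the 'type' field to global memory
// so the host can inspect what the device actually sees.
__global__ void read_type(int *out)
{
    if (blockIdx.x == 0 && threadIdx.x == 0)
        out[0] = device_params[0].type;
}

int main()
{
    parameters params = {0};
    params.type = 8;

    // Copy the host structure into constant memory and check the result.
    cudaError_t err = cudaMemcpyToSymbol(device_params, &params, sizeof(parameters));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int *d_out = NULL;
    err = cudaMalloc((void **)&d_out, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    read_type<<<1, 32>>>(d_out);
    err = cudaGetLastError();   // reports launch errors
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The blocking copy also waits for the kernel to finish.
    int h_out = -1;
    err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaFree(d_out);

    printf("device_params[0].type=%d\n", h_out);
    return 0;
}
```

If this version prints the value I set on the host while the two-file version does not, that would point at the external declaration rather than at cudaMemcpyToSymbol or at the one-element array.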