Problem with constant memory: can I define it as extern?

I’d like to use constant memory in my CUDA program.

Since the kernel must access several fields of a parameters structure, I thought of placing that structure in constant memory. The structure is quite simple: it has only float and int fields. My first doubt is that in the code samples I have read, constant memory is usually declared as arrays rather than scalars, so I defined an array of 1 element to avoid any problem. My second doubt is that the kernel and the main are in different files. So I declared the constant memory in the file containing the main (where I copy the structure into constant memory), and in the file containing the kernel I declared the variable as extern.

The main works as follows:

  1. I fill the structure in host memory

  2. I copy the structure from host memory to constant memory with the function cudaMemcpyToSymbol

  3. I initialize the device memory that I need for the kernel

  4. I launch the kernel

The main looks like this:


typedef struct {
    float wr;
    float wh;
    float V0;
    float Va;
    float w0;
    int type;
} parameters;

__constant__ parameters device_params[1];

int main(){
    parameters params;
    // …
    // initialize params
    // …
    // copy params into device_params with cudaMemcpyToSymbol
    // …
    kernel<<<Nblocks,Nthreads>>>(data);
    // …
}


In the kernel file I declared the constant memory as extern:


extern __constant__ parameters device_params[1];

__global__ void kernel(float *data){
    // …
}

Now, if I run the program in emulation mode and print the value of the type field of device_params[0] inside the kernel, the result is 0 instead of 8.

Since cudaMemcpyToSymbol (and the other CUDA functions) return cudaSuccess, I don’t understand why the value is wrong inside the kernel.

Maybe nvcc doesn’t recognize the extern keyword and defines another constant memory variable with the same name. Or maybe emulation mode fails, but when I compile without the emulation flag the result of the program is also wrong…

I use CUDA toolkit 2.3 under Ubuntu 9.10, 64-bit.

That is what is happening. All CUDA symbols are compiled at file scope, so each file ends up with its own device_params. Because there is no linker for CUDA device code, there is no way to implement extern.
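To make the file-scope point concrete, here is a minimal sketch in which the constant symbol, the copy, and the kernel that reads it all live in the same .cu file. It also declares the symbol as a plain struct rather than a 1-element array, which constant memory allows. The kernel name read_type, the output buffer, and the field values other than type = 8 are assumptions made up for illustration; they are not from the original program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

typedef struct {
    float wr;
    float wh;
    float V0;
    float Va;
    float w0;
    int type;
} parameters;

// File-scope symbol in constant memory. A scalar struct is enough;
// it does not have to be an array.
__constant__ parameters device_params;

// Hypothetical kernel: defined in the same file as the symbol, so it
// reads the very object that cudaMemcpyToSymbol wrote to.
__global__ void read_type(int *out) {
    *out = device_params.type;
}

int main() {
    // Illustrative values; only type = 8 comes from the thread.
    parameters params = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 8};

    cudaError_t err = cudaMemcpyToSymbol(device_params, &params, sizeof(params));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int *d_out = NULL;
    err = cudaMalloc((void **)&d_out, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    read_type<<<1, 1>>>(d_out);

    // Copy the value seen by the kernel back to the host and print it there.
    int h_out = 0;
    err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("type = %d\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

Copying the value back through a small device buffer avoids relying on printing from inside the kernel, so the same check works with or without emulation mode.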


In fact, I debugged my code with cuda-gdb, and inside the kernel defined in the other file all fields were set to zero. So I created a kernel in the main file just to print the values in device memory, and in this second kernel the values were correct.
Now I have written a function in the kernel file that wraps cudaMemcpyToSymbol for the main… I hope this solves the problem.
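A rough sketch of that wrapper approach might look like the following. The names upload_params and run_kernel, the file names in the comments, and the int output buffer are hypothetical and chosen only for illustration; the real kernel takes a float *data. The listing compiles as one file, and the comment markers show where it could be split so that the main file never touches the constant symbol directly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- params.h (hypothetical header shared by both files) ----
typedef struct {
    float wr;
    float wh;
    float V0;
    float Va;
    float w0;
    int type;
} parameters;

cudaError_t upload_params(const parameters *p);  // hypothetical wrapper
cudaError_t run_kernel(int *d_out);              // hypothetical launcher

// ---- kernel.cu: owns the symbol, the kernel, and the wrappers ----
__constant__ parameters device_params[1];

__global__ void kernel(int *out) {
    // Element 0 is the only valid element of a 1-element array.
    *out = device_params[0].type;
}

// The copy is issued from the file that defines device_params,
// so it targets the same symbol the kernel reads.
cudaError_t upload_params(const parameters *p) {
    return cudaMemcpyToSymbol(device_params, p, sizeof(parameters));
}

cudaError_t run_kernel(int *d_out) {
    kernel<<<1, 1>>>(d_out);
    return cudaGetLastError();
}

// ---- main.cu: only calls the wrappers ----
int main() {
    // Illustrative values; only type = 8 comes from the thread.
    parameters params = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 8};

    if (upload_params(&params) != cudaSuccess) {
        fprintf(stderr, "upload_params failed\n");
        return 1;
    }

    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    if (run_kernel(d_out) != cudaSuccess) {
        fprintf(stderr, "kernel launch failed\n");
        return 1;
    }

    int h_out = 0;
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed\n");
        return 1;
    }

    printf("type seen by kernel = %d\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

The design choice here is simply that every reference to device_params stays inside one translation unit, which is what the file-scope rule requires.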
