Use/Copy 'volatile' array in CUDA


I was wondering if anyone has had any significant experience using the ‘volatile’ declaration for CUDA, as the Programmer’s Guide has only a small section on it. From the programmer’s guide, the extent of info we get is “If a variable located in global or shared memory is declared as volatile, the compiler assumes that its value can be changed at any time by another thread”, which is exactly what I want.

So, what I basically want is this: I want an array of elements that I initialize to some values on the CPU. The threads of my kernels will be accessing/modifying them at any time, so I want the other threads to see the changes (hence the volatile declaration). The code I basically want looks like:

[codebox] int cpu_foo[10] = {2,3,5,7,11,13,17,19,23,29};

volatile int* foo = 0;

cutilSafeCall(cudaMalloc((void**) &foo, sizeof(int)*10));

cutilSafeCall(cudaMemcpy(foo,cpu_foo, sizeof(int)*10, cudaMemcpyHostToDevice));[/codebox]

However, when I try and compile, I get the following complaint:

home/nere/Desktop/CUDA_NN/src/ error: argument of type “volatile int *” is incompatible with parameter of type “void *”

Is there any way to create a volatile array in device memory from the host side and initialize it? Thanks!

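Simply typecast foo to (void *) when calling cudaMemcpy(). The compiler rejects the implicit conversion because it would silently drop the volatile qualifier; an explicit cast tells it that this is intended.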
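For what it's worth, here is a minimal sketch of how that cast could look in a complete round trip. The kernel `addOne` and the helper `runVolatileExample` are made-up names for illustration, not anything from the original code, and the sketch checks return codes directly instead of using cutilSafeCall. It assumes the array is small enough to fit in a single block.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: because the parameter is volatile, each access
// through `data` is compiled as a real memory read or write.
__global__ void addOne(volatile int* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = data[i] + 1;
    }
}

// Hypothetical helper: copies host_in to the device, runs the kernel,
// and copies the result into host_out. Returns true if every call succeeded.
// Assumes n fits in one block (n <= 1024 on current hardware).
bool runVolatileExample(const int* host_in, int* host_out, int n)
{
    volatile int* foo = 0;
    size_t bytes = sizeof(int) * n;

    if (cudaMalloc((void**)&foo, bytes) != cudaSuccess) {
        return false;
    }

    // The explicit cast drops the volatile qualifier for the host API call only.
    bool ok = cudaMemcpy((void*)foo, host_in, bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        addOne<<<1, n>>>(foo, n);
        // This blocking copy also waits for the kernel to finish.
        ok = cudaMemcpy(host_out, (void*)foo, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree((void*)foo);
    return ok;
}
```

Calling `runVolatileExample(cpu_foo, result, 10)` from main() with a 10-element `result` array should leave each input value incremented by one in `result`. Another option, if the casts bother you, is to keep the host-side pointer as a plain `int*` and put volatile only on the kernel parameter, since converting `int*` to `volatile int*` is implicit.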