__constant__ is not accessible in kernel

Hi all,
I’m stuck on a silly problem, which probably has a very simple answer. For some reason, when I try to print out a variable declared as __device__ __constant__ in emulation mode, the code prints zero, although the constant has been initialized. Say, in main.cu I have:

// main.cu+
__device__ __constant__ unsigned int gg_NX;

int main()
{
    unsigned int NX=300;
    CUDA_SAFE_CALL( cudaMemcpyToSymbol("gg_NX", &NX, sizeof(unsigned int)) );

    push();
}
// main.cu-

and in push.cu
// push.cu+
__device__ __constant__ unsigned int gg_NX;

void push(void)
{
    pushKernel <<< dimGrid,dimBlocks >>> ();
}

__global__ void pushKernel(void)
{
    printf(" pushKernel: gg_NX=%i \n", gg_NX);
}
// push.cu-

And when I run this program in the emulation mode, I get
pushKernel: gg_NX=0 instead of the expected gg_NX=300

Does anybody by any chance have a quick answer to my problem?

Variables declared as __constant__ are, IIRC, file-scope. You actually have two separate gg_NXs there, one per .cu file (curiously, this doesn’t create a name clash?). You initialize the one in main.cu and print the one in push.cu, which was never written to.

In fact, this shouldn’t even work in C unless you used the extern keyword.

If I keep the definition “__device__ __constant__ gg_NX” in main.cu and put the declaration “extern __device__ __constant__ gg_NX” in push.cu, the code compiles okay, but there’s an error at runtime. Namely, execution stops with the message “invalid device symbol” when the code tries to execute the cudaMemcpyToSymbol call in main.cu. I’ve tried different combinations of declarations and definitions, but I still cannot read the gg_NX constant in my kernel. I know I’m making a silly error somewhere, but I do not see it. Any idea where it might be?

AFAIK extern is not allowed for __constant__ memory. It’s stupid, but that’s how it works now. You will need to physically have the declaration and its initialization (the cudaMemcpyToSymbol call) in the same file as the kernel that reads it.

You might move the cudaMemcpyToSymbol to a wrapper function in push.cu perhaps?
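
Something along these lines, as a rough sketch only. It assumes push.cu is the single file that holds the constant, the kernel, and the host code that writes the constant. The names setNX and the 1x1 launch configuration are made up for illustration, the symbol is passed to cudaMemcpyToSymbol directly rather than as a name string (which is what newer toolkits expect), and printf inside a kernel needs either emulation mode or a device that supports it.

```cuda
// push.cu (sketch): the constant, its setter, and the kernel share one file
#include <cstdio>
#include <cuda_runtime.h>

// Single definition of the constant; nothing outside this file names it.
__device__ __constant__ unsigned int gg_NX;

// Hypothetical wrapper: the only host code that writes the symbol.
cudaError_t setNX(unsigned int nx)
{
    return cudaMemcpyToSymbol(gg_NX, &nx, sizeof(nx));
}

// Kernel reads the same gg_NX that setNX wrote.
__global__ void pushKernel(void)
{
    printf("pushKernel: gg_NX=%u\n", gg_NX);
}

// Launches one thread (illustrative configuration) and waits for it,
// so that the kernel's printf output is flushed before returning.
cudaError_t push(void)
{
    pushKernel<<<1, 1>>>();
    return cudaDeviceSynchronize();
}
```

main.cu then only needs the prototypes of setNX and push, and calls setNX(300) before push(). It never refers to gg_NX itself, so there is no second copy of the symbol to get out of sync.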