cudaMemcpyToSymbol does not copy data

I want to use constant memory which will be accessed by all threads across all of my kernels.

The declaration is something like this

extern constant float smooth [8 * 1024];

I am copying data to this variable using

cudaMemcpyToSymbol("smooth", smooth_local, smooth_size, 0, cudaMemcpyHostToDevice);

smooth_size = 7K bytes

It was giving me incorrect output. When I ran it in -deviceemu mode and printed the contents of both variables inside the kernel, smooth was all zeroes while smooth_local was correct.

I also tried printing the values right after cudaMemcpyToSymbol, and it still gave me zeroes.

Can anyone shed light on my problem?

You must declare GPU constant memory using the CUDA __constant__ qualifier. I am pretty certain that if you check the error status with cudaGetLastError after the cudaMemcpyToSymbol call, you will find you are getting an undefined symbol error.
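
A minimal sketch of that check, assuming the array is defined in the same .cu file as the host code. The 2 * 1024 size and the placeholder fill values are only for illustration. cudaMemcpyToSymbol returns a cudaError_t itself, so it can be tested directly and printed with cudaGetErrorString.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Constant memory needs the __constant__ qualifier on the definition.
__constant__ float smooth[2 * 1024];

int main() {
    static float smooth_local[2 * 1024];
    for (int i = 0; i < 2 * 1024; ++i) {
        smooth_local[i] = 1.0f;  // placeholder data, not from the original post
    }

    // The symbol is passed directly here. The quoted-name form ("smooth")
    // was accepted by older toolkits only (removed in CUDA 5.0, to my knowledge).
    cudaError_t err = cudaMemcpyToSymbol(smooth, smooth_local, sizeof(smooth_local),
                                         0, cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("copy to constant memory succeeded\n");
    return 0;
}
```

If the symbol cannot be resolved, the error branch is where a message such as "invalid device symbol" would show up, instead of the copy failing silently.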

Sorry, I missed the __ before and after constant.

I am using this:

extern __constant__ float smooth [2 * 1024];

Now when I print the result of cudaGetLastError, I get "invalid device symbol".

When I remove "extern", it works.

__constant__ float smooth [2 * 1024];
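
For anyone finding this later, here is a rough end-to-end sketch of the working layout: the __constant__ array is defined (not just declared extern) in the same .cu file as the kernel and the copy. The kernel name copy_smooth, the macro N, and the test data are hypothetical and only there to verify the round trip. As far as I understand, extern on a __constant__ variable is only supported when compiling with relocatable device code (nvcc -rdc=true, CUDA 5.0 or later); without that, the symbol has no definition in the translation unit, which would explain the "invalid device symbol" error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N (2 * 1024)

// Defined without extern, in the same file as the kernel that reads it.
__constant__ float smooth[N];

// Hypothetical kernel: copies the constant table into global memory
// so the host can check what the device actually sees.
__global__ void copy_smooth(float *out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) out[i] = smooth[i];
}

int main() {
    static float smooth_local[N], result[N];
    for (int i = 0; i < N; ++i) smooth_local[i] = 0.5f * i;  // placeholder data

    cudaError_t err = cudaMemcpyToSymbol(smooth, smooth_local, sizeof(smooth_local));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float *d_out = NULL;
    err = cudaMalloc(&d_out, sizeof(result));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    copy_smooth<<<(N + 255) / 256, 256>>>(d_out);

    // cudaMemcpy waits for the kernel and also reports its errors.
    err = cudaMemcpy(result, d_out, sizeof(result), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel or copy back: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Count elements that differ from what the host sent.
    int mismatches = 0;
    for (int i = 0; i < N; ++i) {
        if (result[i] != smooth_local[i]) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);
    return mismatches != 0;
}
```

Reading the values back through a kernel like this is closer to what the real kernels see than printing in -deviceemu mode, since it goes through the actual device constant memory.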