CUDA constant memory reads back as all zeros

__constant__ float c_kernel[KERNEL_LENGTH];

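extern "C" void setGaussianKernel(float *h_Kernel, int kernelL)
{
    // Print the host-side weights to confirm they are correct before the copy
    for (int i = 0; i < kernelL; i++) printf("%f ", h_Kernel[i]);

    // Copy the weights into the __constant__ array c_kernel
    cutilSafeCall(cudaMemcpyToSymbol(c_kernel, h_Kernel, kernelL * sizeof(float)));
}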

__global__ void test_kernel( ... )
{
    ... ...

    // Read the constant array back out so it can be inspected on the host
    if (i < KERNEL_LENGTH) dev_result[i] = c_kernel[i];
}

After the code above is executed, dev_result is all zeros. I checked that h_Kernel is correct before the cudaMemcpyToSymbol call. cudaMemcpyToSymbol is definitely called, and kernelL and KERNEL_LENGTH have the same value. Can someone shed some light on this?

PS: Other kernels in different .cu files use other constant memory arrays; I am not sure whether that would affect the result. Also, the total constant memory usage is small and does not exceed 64 KB.


cutilSafeCall(cudaMemcpyToSymbol("c_kernel", h_Kernel, kernelL*sizeof(float)));

With the symbol name passed as a string like this, it seems to distinguish this constant array from the other ones I use in other .cu files. Yet the results are still zeros.

Sigh… I am sorry, this was a silly mistake on my part. The floats in the Gaussian kernel were being cast to integers. Thanks for the help.
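
For anyone who lands here with the same symptom, here is a minimal host-side sketch of how a float-to-int cast can zero out a Gaussian kernel. It assumes the weights are normalized to sum to 1, so every weight is below 1 and truncates to 0. The helper names makeGaussian and truncateToInt are hypothetical and are not from my original code; the actual cast in my project sat elsewhere.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a normalized 1D Gaussian of the given length.
// After normalization the weights sum to 1.
std::vector<float> makeGaussian(int length, float sigma)
{
    assert(length > 0 && sigma > 0.0f);
    std::vector<float> w(length);
    const int center = length / 2;
    float sum = 0.0f;
    for (int i = 0; i < length; ++i) {
        const float x = static_cast<float>(i - center);
        w[i] = std::exp(-(x * x) / (2.0f * sigma * sigma));
        sum += w[i];
    }
    for (float &v : w) v /= sum;  // normalize so the weights sum to 1
    return w;
}

// Hypothetical illustration of the bug: converting float weights to int
// truncates toward zero, so any weight in (0, 1) becomes 0.
std::vector<int> truncateToInt(const std::vector<float> &w)
{
    std::vector<int> out(w.size());
    for (std::size_t i = 0; i < w.size(); ++i) {
        out[i] = static_cast<int>(w[i]);
    }
    return out;
}
```

Calling truncateToInt(makeGaussian(7, 1.5f)) yields seven zeros, even though the float weights themselves are fine. So if the buffer printed on the host looks right but the device sees zeros, it is worth checking every type along the path (the host array, the __constant__ declaration, the output buffer, and the printf format used to inspect the result) before suspecting cudaMemcpyToSymbol.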