Hi
I’m writing to share a frustrating problem I ran into while trying to run this code:
/***************/
/* TEST KERNEL */
/***************/
// The constant symbol must be declared at file scope, before the kernel that reads it.
__constant__ __device__ int const_symbol;

__global__ void kernel() {
    printf("Constant GPU %d\n", const_symbol);
}

void run() {
    int buffer[8];
    int numDev_avail;
    printf("Start run\n");
    gpuErrchk(cudaGetDeviceCount(&numDev_avail));
    printf("Number of dev available: %d\n", numDev_avail);
    // Copy a distinct value into each device's copy of the constant symbol.
    for (int i = 0; i < numDev_avail; i++) {
        printf("Device number: %d\n", i);
        buffer[i] = i + 30; // Why 30??? why not??
        gpuErrchk(cudaSetDevice(i));
        gpuErrchk(cudaMemcpyToSymbol(const_symbol, &(buffer[i]), sizeof(int), 0, cudaMemcpyHostToDevice));
    }
    // Launch the kernel on each device so it prints the value it sees.
    for (int i = 0; i < numDev_avail; i++) {
        gpuErrchk(cudaSetDevice(i));
        kernel<<<1,1>>>();
    }
}
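gpuErrchk is not shown in the snippet. A minimal sketch, assuming it is the commonly used error-checking wrapper whose output matches the "GPUassert:" format of the error message; the helper name gpuAssert and its abort parameter are assumptions:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Assumed helper: prints the CUDA error string with file and line, then exits.
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true) {
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

// Wraps a runtime API call so that a failure reports where it happened.
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
```

With a wrapper like this, a failing cudaMemcpyToSymbol reports the error string followed by the source file and line, which is the shape of the message I get.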
The code is simple, yet it fails at run time with this error:
GPUassert: invalid device symbol */src/prueba.cu 44
(line 44 is the cudaMemcpyToSymbol call)
It was compiled for the right architecture (sm_35). After a couple of days looking for an answer, I found the cause: the -fcilkplus flag was being propagated to nvcc. It makes no difference that the code doesn’t contain any Cilk statements. The craziest part is that the symbol was present in the binary:
cuobjdump (executable file) -arch sm_35 -symbols | grep -i symbol
symbols:
STT_OBJECT STB_GLOBAL STV_DEFAULT const_symbol
Should this happen? Why is the symbol not found when it is there?
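To separate "the symbol is in the binary" from "the runtime can resolve the symbol", a small probe along these lines may help. This is only a sketch: the function name probeSymbol is hypothetical, and it assumes the build flags are the same as for the failing program. It looks the symbol up with cudaGetSymbolAddress on each device before copying, and it synchronizes after the launch so that launch errors and the kernel's printf output become visible.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ int const_symbol;

__global__ void kernel() {
    printf("Constant GPU %d\n", const_symbol);
}

// Hypothetical helper: checks whether the runtime can resolve const_symbol
// on device dev, then copies a value to it and launches the kernel.
cudaError_t probeSymbol(int dev) {
    cudaError_t err = cudaSetDevice(dev);
    if (err != cudaSuccess) return err;

    // Ask the runtime for the device address of the symbol.
    void *addr = nullptr;
    err = cudaGetSymbolAddress(&addr, const_symbol);
    printf("Device %d: symbol lookup: %s\n", dev, cudaGetErrorString(err));
    if (err != cudaSuccess) return err;

    int value = dev + 30;
    err = cudaMemcpyToSymbol(const_symbol, &value, sizeof(int));
    if (err != cudaSuccess) return err;

    kernel<<<1, 1>>>();
    err = cudaGetLastError();      // reports launch failures
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize(); // reports execution failures, flushes printf
}

int main() {
    int numDev = 0;
    cudaError_t err = cudaGetDeviceCount(&numDev);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < numDev; i++) {
        err = probeSymbol(i);
        if (err != cudaSuccess)
            printf("Device %d failed: %s\n", i, cudaGetErrorString(err));
    }
    return 0;
}
```

If the lookup already fails here, the problem is in how the device code was registered with the runtime at build time, not in the copy itself.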
NOTE: The reason I use Cilk is that I have to port code that uses it to CUDA.
Thanks for your attention