Error with -fcilkplus

phirus · September 9, 2016, 8:43am

Hi

I’m writing to share my experience with a frustrating problem I ran into while trying to run this code:

#include <cstdio>

// gpuErrchk: error-checking macro, definition not shown here.

// The constant symbol and the kernel must be declared at file scope,
// before the code that uses them.
__constant__ __device__ int const_symbol;

/***************/
/* TEST KERNEL */
/***************/
__global__ void kernel() {
  printf("Constant GPU %d\n", const_symbol);
}

void run() {
  int buffer[8];
  int numDev_avail;
  printf("Start run\n");

  gpuErrchk(cudaGetDeviceCount(&numDev_avail));
  printf("Number of dev available: %d\n", numDev_avail);

  // Each device holds its own copy of const_symbol, so set it once per device.
  for (int i = 0; i < numDev_avail; i++) {
    printf("Device number: %d\n", i);
    buffer[i] = i + 30;  // Why 30??? why not??
    gpuErrchk(cudaSetDevice(i));
    gpuErrchk(cudaMemcpyToSymbol(const_symbol, &(buffer[i]), sizeof(int), 0, cudaMemcpyHostToDevice));
  }

  // Launch the test kernel on every device.
  for (int i = 0; i < numDev_avail; i++) {
    gpuErrchk(cudaSetDevice(i));
    kernel<<<1,1>>>();
  }
}

The code is very simple, yet it shows this error at run time:

GPUassert: invalid device symbol */src/prueba.cu 44 [line 44 is the cudaMemcpyToSymbol call]
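
The gpuErrchk macro itself is not shown in this post. As a minimal sketch, assuming it follows the widely used gpuAssert pattern (an assumption, not necessarily the exact definition used here), a macro that prints a message in the format above could look like this:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Assumed helper: prints "GPUassert: <error string> <file> <line>" and
// optionally aborts when a CUDA runtime call does not return cudaSuccess.
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true) {
  if (code != cudaSuccess) {
    fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
    if (abort) exit(code);
  }
}

// Wraps a CUDA runtime call and records the call site for the message.
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
```

With a definition like this, the file name and line number in the message point at the wrapped call that failed, not at the kernel.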

It was compiled for the right architecture (sm_35). After a couple of days looking for an answer, I found it: the -fcilkplus flag was being propagated to nvcc. It doesn’t matter that the code contains no Cilk statements. The craziest part is that the symbol was present in the binary:

cuobjdump (executable file) -arch sm_35 -symbols | grep -i symbol
symbols:
STT_OBJECT       STB_GLOBAL STV_DEFAULT    const_symbol

Should this happen? Why is the symbol not found when it is there?

NOTE: The reason I use Cilk is that I have to port code that uses it to CUDA.

Thanks for your attention

tera · September 9, 2016, 9:10am

Compilation of CUDA programs follows a relatively complicated trajectory in order to split out device code, but still be able to use the unmodified host compiler on a variety of operating systems. This delicate process easily breaks in the presence of other language modifications that NVCC does not know about, often resulting in cryptic error messages.

I’d recommend keeping the CUDA-specific code in separate files that are compiled by NVCC without -fcilkplus, and compiling the remainder with the host compiler using -fcilkplus.

Linking can still be done with NVCC, or with the host linker provided the right linking options are used. Running NVCC once with the --dryrun option comes in handy for determining those options.
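
As a rough sketch of that split (the file names gpu_part.cu and host_part.cpp, the function set_and_print_constant, and the build lines in the comments are all hypothetical, and the library options will depend on the installation), the CUDA side could expose a plain C++ function so that the Cilk-enabled host code never sees any CUDA syntax:

```cuda
// gpu_part.cu (hypothetical name): compiled by nvcc only, WITHOUT -fcilkplus.
#include <cstdio>
#include <cuda_runtime.h>

__constant__ int const_symbol;

__global__ void kernel() {
  printf("Constant GPU %d\n", const_symbol);
}

// Plain C++ entry point callable from the Cilk-enabled host code.
// Returns 0 (cudaSuccess) on success, otherwise the CUDA error code as an int.
int set_and_print_constant(int device, int value) {
  cudaError_t err = cudaSetDevice(device);
  if (err != cudaSuccess) return (int)err;

  err = cudaMemcpyToSymbol(const_symbol, &value, sizeof(int), 0, cudaMemcpyHostToDevice);
  if (err != cudaSuccess) return (int)err;

  kernel<<<1, 1>>>();
  err = cudaGetLastError();          // catches launch errors
  if (err != cudaSuccess) return (int)err;
  return (int)cudaDeviceSynchronize();  // waits for the kernel and flushes printf
}

// host_part.cpp (hypothetical), compiled by the host compiler WITH -fcilkplus:
//   int set_and_print_constant(int device, int value);
//   int main() { return set_and_print_constant(0, 30); }
//
// Possible build steps, assuming GCC as the host compiler:
//   nvcc -arch=sm_35 -c gpu_part.cu -o gpu_part.o
//   g++ -fcilkplus -c host_part.cpp -o host_part.o
//   nvcc -arch=sm_35 gpu_part.o host_part.o -o app -lcilkrts
```

The point of the design is that only the function declaration crosses the boundary, so the host-only flags never reach the device compilation path.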

phirus · September 9, 2016, 12:31pm

Yeah, absolutely. I stopped propagating the host flags to nvcc and the problem disappeared.