Compute sanitizer hangs indefinitely while cuda-memcheck works as is

Compute Sanitizer hangs indefinitely when a runtime conditional (graph->seq) is true and a cudaMemcpy is performed, unless a cudaDeviceSynchronize is added before the copy. cuda-memcheck runs to completion unchanged, whether the conditional is true or false.

EDITED: initcheck reports the memory being copied as uninitialized. I don’t think it is, but that is what the tool states.

Usage: compute-sanitizer --tool memcheck ./fitz_bin -f …/schwing_tests/blossreq.txt -u -r 1 -k 0 -c -p

========= Host API memory access error at host access to 0x7f9e55803800 of size 80 bytes
========= Uninitialized access at 0x7f9e55803840 on access by cudaMemcpy source
========= Saved host backtrace up to driver entry point at error
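
One detail in the report above may help narrow this down: the flagged address 0x7f9e55803840 is 0x40 (64) bytes into the 80-byte source range starting at 0x7f9e55803800, so at least part of the tail of that copy appears never to have been written. Below is a minimal sketch of that kind of situation, not the code from this post: the kernel `fill_prefix`, the buffer names, and the 16-of-20 split are all hypothetical, chosen only to match the sizes in the report. It assumes the cudaMemcpy source is a cudaMalloc allocation that a kernel fills only partially.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with file/line information if a CUDA runtime call fails.
#define CHECK(call)                                          \
  do {                                                       \
    cudaError_t err_ = (call);                               \
    if (err_ != cudaSuccess) {                               \
      fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
              cudaGetErrorString(err_));                     \
      exit(EXIT_FAILURE);                                    \
    }                                                        \
  } while (0)

// Hypothetical kernel: writes only the first n_written elements of out.
__global__ void fill_prefix(int *out, int n_written) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n_written) out[i] = i;
}

// Returns the number of copied elements that differ from the expected values.
int run_example() {
  const int n = 20;          // 20 ints = 80 bytes, the size in the report
  const int n_written = 16;  // 16 ints = 64 bytes = offset 0x40
  int h_buf[n];
  int *d_buf = nullptr;

  CHECK(cudaMalloc(&d_buf, n * sizeof(int)));
  // Assumption: without this memset the last 16 bytes are never written,
  // which is the kind of source range initcheck is designed to flag.
  CHECK(cudaMemset(d_buf, 0, n * sizeof(int)));

  fill_prefix<<<1, 32>>>(d_buf, n_written);
  CHECK(cudaGetLastError());

  // The whole 80-byte buffer is the cudaMemcpy source.
  CHECK(cudaMemcpy(h_buf, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost));
  CHECK(cudaFree(d_buf));

  int mismatches = 0;
  for (int i = 0; i < n; ++i) {
    int expected = (i < n_written) ? i : 0;
    if (h_buf[i] != expected) ++mismatches;
  }
  return mismatches;
}

int main() {
  int mismatches = run_example();
  printf("mismatches: %d\n", mismatches);
  return mismatches == 0 ? 0 : 1;
}
```

If the real buffer is padded or only partly filled by design, zeroing it once after allocation (as the cudaMemset does here) is one way to make every copied byte defined.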

I have a kernel that hangs indefinitely when run under Compute Sanitizer (CS). What is strange is that adding a cudaDeviceSynchronize after the kernel remedies the situation. cuda-memcheck doesn’t need the cudaDeviceSynchronize.


The code is a private repo, but the essential outline is posted below.

  *c=1;
  while (*c){
    *c=0;
    *path_extracted=0;

    cudaEventRecord(start);
    kernel1 <<<dimGrid,THREADS_PER_BLOCK>>> (...);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(...);
    t_set_b_type_t += t_set_b_type;
    //printf("MUTEX %d\n",*c);
    if(*c){
      cudaEventRecord(start);
      kernel2 <<<1,1>>> (...);
      cudaEventRecord(stop);
      cudaEventSynchronize(stop);
      cudaEventElapsedTime(...);

    }
    if (*c && *path_extracted){
      cudaEventRecord(start);
      kernel3 <<<1,32>>> (..., c, path_extracted);
      cudaEventRecord(stop);
      cudaEventSynchronize(stop);
      cudaEventElapsedTime(...);

      cudaEventRecord(start);
      kernel4 <<<1,THREADS_PER_BLOCK>>> (...);
      kernel5 <<<dimGrid,THREADS_PER_BLOCK>>> (...);
      cudaEventRecord(stop);
      cudaEventSynchronize(stop);
      cudaEventElapsedTime(...);
    }
  }

  // THIS IS NECESSARY FOR COMPUTE-SANITIZER TO WORK WHEN CHECKING?!?!?
  //cudaDeviceSynchronize();
  if (graph->seq){
    checkCudaErrors(cudaMemcpy(...));
  }
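
For anyone who wants something buildable to experiment with, here is a minimal, self-contained sketch of the same host-side pattern: a flag that the kernel sets and the host polls in a while loop, followed by a cudaMemcpy. It is not the private code and is not claimed to reproduce the hang. The kernel `step`, the buffer `data`, the `round` counter, and the use of cudaMallocManaged for `c` are all assumptions made for illustration, and the timing events are left out. Each launch is followed by an error check and a cudaDeviceSynchronize so that the host read of `*c` is well defined.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with file/line information if a CUDA runtime call fails.
#define CHECK(call)                                          \
  do {                                                       \
    cudaError_t err_ = (call);                               \
    if (err_ != cudaSuccess) {                               \
      fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
              cudaGetErrorString(err_));                     \
      exit(EXIT_FAILURE);                                    \
    }                                                        \
  } while (0)

// Hypothetical stand-in for kernel1: increments every element and asks
// for another iteration during the first two rounds only.
__global__ void step(int *c, int *data, int n, int round) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] += 1;
  if (i == 0 && round < 2) *c = 1;
}

// Runs the flag-driven loop and returns data[0] after the final copy.
int run_loop() {
  const int n = 256;
  int *c = nullptr;     // flag shared by host and device (assumed managed)
  int *d_data = nullptr;
  int h_data[n];

  CHECK(cudaMallocManaged(&c, sizeof(int)));
  CHECK(cudaMalloc(&d_data, n * sizeof(int)));
  CHECK(cudaMemset(d_data, 0, n * sizeof(int)));

  int round = 0;
  *c = 1;
  while (*c) {
    *c = 0;
    step<<<(n + 127) / 128, 128>>>(c, d_data, n, round++);
    CHECK(cudaGetLastError());        // catches launch errors
    CHECK(cudaDeviceSynchronize());   // kernel done before the host reads *c
  }

  // Counterpart of the cudaMemcpy guarded by graph->seq in the outline.
  CHECK(cudaMemcpy(h_data, d_data, n * sizeof(int), cudaMemcpyDeviceToHost));

  CHECK(cudaFree(d_data));
  CHECK(cudaFree(c));
  return h_data[0];
}

int main() {
  printf("data[0] = %d\n", run_loop());  // three launches, so 3 is expected
  return 0;
}
```

Assuming the file is saved as repro.cu (a hypothetical name), it can be built with `nvcc -o repro repro.cu` and run as `compute-sanitizer --tool memcheck ./repro`, then trimmed or extended toward the real code until the behavior changes.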

Thanks for reaching out. I’ve contacted the Compute Sanitizer engineering team to see if we can get to the bottom of this. I’ll let you know when I have more information.

Would you be able to provide a buildable and executable reproducer for this issue? You can send it as a message if you don’t want to post it in the thread. The engineering team isn’t able to tell what could be causing the issue from the limited snippet.