Compute Sanitizer hangs indefinitely when a runtime conditional (graph->seq) is true and a cudaMemcpy is performed, unless a cudaDeviceSynchronize is added before the copy. cuda-memcheck works as is, whether the conditional is true or false.
EDITED: It appears this memory is uninitialized. I don’t think it is, but initcheck states it is.
Usage: compute-sanitizer --tool memcheck ./fitz_bin -f …/schwing_tests/blossreq.txt -u -r 1 -k 0 -c -p
========= Host API memory access error at host access to 0x7f9e55803800 of size 80 bytes
========= Uninitialized access at 0x7f9e55803840 on access by cudaMemcpy source
========= Saved host backtrace up to driver entry point at error
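For readers unfamiliar with this kind of report, here is a minimal sketch of one way it can arise. This is an illustration under assumptions, not a diagnosis of the code in this post: the kernel `fill_prefix`, the helper `copy_whole_buffer`, and the 16-of-20 split are all hypothetical, chosen only so that an 80-byte copy has an untouched tail starting at byte 64, like the offsets in the report. It also uses plain `cudaMalloc` memory, whereas the address in the report may belong to a different kind of allocation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: writes only the first n_written ints of buf.
__global__ void fill_prefix(int *buf, int n_written) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_written) buf[i] = i;
}

// Copies an 80-byte device buffer back to the host and returns its last int.
// With zero_first == false, the last 16 bytes are never written on the device,
// which is the kind of source range initcheck is designed to flag.
int copy_whole_buffer(bool zero_first) {
    const int n = 20;          // 20 ints = 80 bytes, same size as in the report
    const int n_written = 16;  // the kernel touches only the first 64 bytes
    int h_buf[n] = {0};
    int *d_buf = nullptr;

    if (cudaMalloc(&d_buf, n * sizeof(int)) != cudaSuccess) return -1;

    // Zeroing the whole allocation writes every byte before it is read.
    if (zero_first) cudaMemset(d_buf, 0, n * sizeof(int));

    fill_prefix<<<1, 32>>>(d_buf, n_written);
    cudaMemcpy(h_buf, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);
    return h_buf[n - 1];
}

int main() {
    printf("last element: %d\n", copy_whole_buffer(true));
    return 0;
}
```

If the real buffer holds structs, padding bytes between members can produce the same kind of report even when every field is assigned, so a full `cudaMemset` right after allocation is a cheap way to test that theory.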
I have a kernel which hangs indefinitely when run under Compute Sanitizer. What is strange is that I can remedy the situation by adding a cudaDeviceSynchronize after the kernel. cuda-memcheck doesn’t need the cudaDeviceSynchronize.
The code is a private repo, but the essential outline is posted below.
    *c = 1;
    while (*c) {
        *c = 0;
        *path_extracted = 0;

        cudaEventRecord(start);
        kernel1<<<dimGrid, THREADS_PER_BLOCK>>>(...);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(...);
        t_set_b_type_t += t_set_b_type;
        //printf("MUTEX %d\n",*c);

        if (*c) {
            cudaEventRecord(start);
            kernel2<<<1, 1>>>(...);
            cudaEventRecord(stop);
            cudaEventSynchronize(stop);
            cudaEventElapsedTime(...);
        }

        if (*c && *path_extracted) {
            cudaEventRecord(start);
            kernel3<<<1, 32>>>(..., c, path_extracted);
            cudaEventRecord(stop);
            cudaEventSynchronize(stop);
            cudaEventElapsedTime(...);

            cudaEventRecord(start);
            kernel4<<<1, THREADS_PER_BLOCK>>>(...);
            kernel5<<<dimGrid, THREADS_PER_BLOCK>>>(...);
            cudaEventRecord(stop);
            cudaEventSynchronize(stop);
            cudaEventElapsedTime(...);
        }
    }

    // THIS IS NECESSARY FOR COMPUTE-SANITIZER TO WORK WHEN CHECKING?!?!?
    //cudaDeviceSynchronize();
    if (graph->seq) {
        checkCudaErrors(cudaMemcpy(..));
    }
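
Since the real code is private, a stripped-down standalone program with the same shape may help others try this. The following is only a sketch under stated assumptions: it has not been confirmed to reproduce the hang, the kernel `step`, the helper `run_repro`, the `CHECK` macro, and the `remaining` counter are hypothetical stand-ins, and it assumes `c` lives in managed memory (the outline only shows that the host and the kernels both dereference it) on a system where the host may touch managed memory after an event sync.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro, standing in for checkCudaErrors.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e_));                          \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for kernel1: sets *c while work remains.
__global__ void step(int *c, int *remaining) {
    if (blockIdx.x == 0 && threadIdx.x == 0 && *remaining > 0) {
        *remaining -= 1;
        *c = 1;
    }
}

// Runs the host-driven loop, then a final cudaMemcpy.
// Returns the number of loop iterations executed.
int run_repro(bool sync_before_copy) {
    int *c = nullptr, *remaining = nullptr;
    CHECK(cudaMallocManaged(&c, sizeof(int)));          // assumption: managed flag
    CHECK(cudaMallocManaged(&remaining, sizeof(int)));
    *remaining = 3;

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    int iterations = 0;
    *c = 1;
    while (*c) {
        *c = 0;
        CHECK(cudaEventRecord(start));
        step<<<1, 32>>>(c, remaining);
        CHECK(cudaGetLastError());
        CHECK(cudaEventRecord(stop));
        CHECK(cudaEventSynchronize(stop));  // host reads *c only after this
        ++iterations;
    }

    // The workaround from the outline above, made switchable.
    if (sync_before_copy) CHECK(cudaDeviceSynchronize());

    int host_copy = -1;
    CHECK(cudaMemcpy(&host_copy, remaining, sizeof(int), cudaMemcpyDefault));
    printf("iterations=%d remaining=%d\n", iterations, host_copy);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(c));
    CHECK(cudaFree(remaining));
    return iterations;
}

int main(int argc, char **argv) {
    // Pass any argument to skip the cudaDeviceSynchronize workaround.
    run_repro(argc < 2);
    return 0;
}
```

Running the binary under compute-sanitizer both with and without an argument would show whether this loop shape alone is enough to trigger the hang, or whether something specific to the full application is involved.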