CUDA compute-sanitizer internal error: CUDA initialized before the Sanitizer

I’m trying to run compute-sanitizer on two systems: a Docker container with CUDA 11.2 and a cluster with CUDA 11.7. On both systems, whenever I run compute-sanitizer (with any arguments, although I’m mostly trying to use --leak-check full, --save and --log-file), I get this error:

========= COMPUTE-SANITIZER
========= Internal Sanitizer Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled.
=========

Does anyone know where this might come from? I’m using Clang v14.0.0 and 14.0.6 to compile a C++ OpenMP-offload program (the GPU kernels are written with OpenMP rather than CUDA) that also calls cuFFT and cuBLAS. I cannot find any information online about this error.

Can you share the full command and output that is causing this? Is there any sort of job launcher/scheduler involved?

You’re right that there’s a scheduler involved (Slurm) on the cluster, although I get the same results inside the Docker container, where there is no scheduler involved at all (both use the same toolchains, with some minor version differences, mentioned in my initial post).

The run command is simply srun compute-sanitizer ./exe, and I get the same error with srun compute-sanitizer --leak-check full --save savefile --log-file logfile. Dropping srun makes no difference. The output:

========= COMPUTE-SANITIZER
========= Internal Sanitizer Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled.
========= 
/* non-relevant application output */
========= ERROR SUMMARY: 1 error

Maybe it’s worth mentioning that it’s also an MPI application (I’m using a CUDA-aware OpenMPI v4.1.x with UCX).

Thanks for the details. I have filed a bug with our engineering team. They may reply directly, otherwise I’ll let you know when I have some more information.

It would also be helpful if there was a simple reproducible test case you could share.