CUDA compute-sanitizer internal error: CUDA initialized before the Sanitizer

JYL · September 12, 2022, 10:26am

I’m trying to run compute-sanitizer on two systems: a Docker container with CUDA 11.2 and a cluster with CUDA 11.7. On both systems, whenever I run compute-sanitizer (with any arguments, although I’m mostly trying to use --leak-check full, --save and --log-file), I get this error:

========= COMPUTE-SANITIZER
========= Internal Sanitizer Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled.
=========

Does anyone know where this might come from? I’m using Clang v14.0.0 and v14.0.6 to compile a C++ OpenMP-offload program (the GPU kernels are written with OpenMP rather than CUDA) that also calls cuFFT and cuBLAS. I cannot find any information online about this error…

jmarusarz · September 12, 2022, 4:24pm

Can you share the full command and output that is causing this? Is there any sort of job launcher/scheduler involved?

JYL · September 12, 2022, 4:48pm

You’re right that there’s a scheduler involved (Slurm) on the cluster, although I get the same results inside the Docker container, where there is no scheduler at all (both use the same toolchains, with the minor version differences mentioned in my initial post).

The run command is simply srun compute-sanitizer ./exe. I get the same error with srun compute-sanitizer --leak-check full --save savefile --log-file logfile, and with or without srun. The output:

========= COMPUTE-SANITIZER
========= Internal Sanitizer Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled.
========= 
/* non-relevant application output */
========= ERROR SUMMARY: 1 error

Maybe it’s worth mentioning that it’s an MPI application as well (I’m using a CUDA-aware OpenMPI v4.1.x with UCX).

jmarusarz · September 12, 2022, 9:27pm

Thanks for the details. I have filed a bug with our engineering team. They may reply directly, otherwise I’ll let you know when I have some more information.

It would also be helpful if there was a simple reproducible test case you could share.

jmarusarz · September 26, 2022, 2:27pm

The engineering team is requesting a reproducer to try and debug this internally. Do you have something you can provide?

JYL · September 29, 2022, 1:40pm

This piece of code reproduces the mentioned bug (with CUDA-aware, UCX-based OpenMPI).

#include "mpi.h"

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    MPI_Finalize();
}

aladram · October 4, 2022, 6:20pm

Could you please provide more information regarding:

Which versions of OpenMPI / UCX you use, and where we can download them / how we can install them
Which command line you use to compile / run this sample

This error usually happens when CUDA is initialized from a library's static initialization code (e.g. the initialization of a global variable in one of your libraries), which is not a use case supported by the tool. You can check by running gdb on this program and breaking on the symbol cuInit, then looking at the backtrace to see which library calls it.

john.t014 · February 3, 2023, 6:04pm

I’m getting a similar error with an OpenMP offloading code built with LLVM. Below is a minimal OpenMP reproducer and what running it looks like. Notably, the --require-cuda-init=no argument appears to have no effect.

user@gpu07:~/omp_target_issues/simple$ cat main.cpp
#include <stdio.h>
#include <omp.h>
int main( int argc, char** argv ) {
  int is_initial = omp_is_initial_device();
  #pragma omp target map(from:is_initial)
  {
    is_initial = omp_is_initial_device();
  }
  if( !is_initial )
    printf( "Hello world from accelerator.\n" );
  else
    printf( "Hello world from host.\n" );
  return 0;
}
user@gpu07:~/omp_target_issues/simple$ clang++ --version
clang version 16.0.0 (git@github.com:llvm/llvm-project.git 124f90bd89b97066e01274a9bba1068f3a175d66)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /gpfs/jlse-fs0/projects/intel_anl_shared/openmc_data/compilers/llvm-16-rc1/bin
user@gpu07:~/omp_target_issues/simple$ clang++ -Wall -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target -march=sm_80 main.cpp -o test
clang-16: warning: CUDA version 11.6 is only partially supported [-Wunknown-cuda-version]
clang-16: warning: CUDA version 11.6 is only partially supported [-Wunknown-cuda-version]
user@gpu07:~/omp_target_issues/simple$ 
user@gpu07:~/omp_target_issues/simple$ ./test
Hello world from accelerator.
user@gpu07:~/omp_target_issues/simple$ compute-sanitizer --tool=initcheck ./test
========= COMPUTE-SANITIZER
========= Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled
========= 
Hello world from accelerator.
========= ERROR SUMMARY: 1 error
user@gpu07:~/omp_target_issues/simple$ compute-sanitizer --tool=initcheck --require-cuda-init=no ./test
========= COMPUTE-SANITIZER
========= Error: CUDA initialized before the Sanitizer. The Sanitizer will be disabled
========= 
Hello world from accelerator.
========= ERROR SUMMARY: 1 error
user@gpu07:~/omp_target_issues/simple$ nvidia-smi
Fri Feb  3 17:59:26 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  On   | 00000000:43:00.0 Off |                    0 |
| N/A   24C    P0    33W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

aladram · February 3, 2023, 6:23pm

--require-cuda-init no is designed to allow non-CUDA programs to be run under the Sanitizer (the tool will not return an error for them). This error is different: it signals that the CUDA driver was initialized before the tool was injected. I will compile clang locally and try to reproduce this.

aladram · February 3, 2023, 11:10pm

I have identified the issue and I am currently evaluating whether or not a fix on our side is possible without making changes to the LLVM codebase.

john.t014 · February 6, 2023, 5:59pm

Thanks @aladram! I’ll also note that the cuda-memcheck utility does work fine, so it seems like a fix should be possible on the NVIDIA side.

markdewing · April 14, 2023, 6:47pm

I’m curious if there’s any progress on this issue? Will it ever be possible to use compute-sanitizer with LLVM-compiled OpenMP code?

For others reading this: what happens is that the OpenMP runtime initializes the target plugins during shared library loading (when libomptarget is loaded), and that initializes CUDA before compute-sanitizer gains control of the application.

aladram · September 6, 2023, 6:41pm

Thanks Mark for your message. Support for LLVM-compiled OpenMP offloaded code will be considered in a future release but is not planned at this time.
