Compute Sanitizer always reports "Target application terminated before first instrumented API call"

I am trying to use compute-sanitizer with my PyTorch program.

However, no matter what program I run under compute-sanitizer, it always reports “Target application terminated before first instrumented API call”.

I tried the basic Memcheck sample from the NVIDIA/compute-sanitizer-samples repository on GitHub.

Here is what I get.

❯ make run_memcheck
/opt/cuda/bin/nvcc -ccbin g++ -I/opt/cuda/include -MMD -c memcheck_demo.cu
/opt/cuda/bin/nvcc -ccbin g++  -o memcheck_demo memcheck_demo.o
/opt/cuda/compute-sanitizer/compute-sanitizer --destroy-on-device-error kernel memcheck_demo
========= COMPUTE-SANITIZER
========= Error: Target application terminated before first instrumented API call
========= Error: couldn't find exit code.
make: *** [Makefile:65: run_memcheck] Error 255
rm memcheck_demo.o
(gpu)

The same happens with Python.

My installation is

❯ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
(gpu)
compute-sanitizer-samples/Memcheck on  master [!?] via 🅒 gpu
❯ compute-sanitizer --version
NVIDIA (R) Compute Sanitizer
Copyright (c) 2020-2024 NVIDIA Corporation
Version 2024.1.0.0 (build 33961263) (public-release)
(gpu)

Also, I noticed that when I launch a CUDA program under compute-sanitizer, it does not show up in nvidia-smi.
However, when I run the same CUDA program WITHOUT compute-sanitizer, everything works and the process appears in nvidia-smi (so my CUDA installation should be OK?).

❯ ./memcheck_demo
Mallocing memory
Running unaligned_kernel: misaligned address
Running out_of_bounds_kernel: misaligned address

Thank you in advance!

OK, after some digging with strace, the problem is solved.

It turns out that the Gentoo package does not package compute-sanitizer properly: it installs “TreeLauncherSubreaper” and “TreeLauncherTargetLdPreloadHelper” without the executable bit set.
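
For anyone who hits the same error, here is a minimal sketch of a check for this condition. It assumes the install directory /opt/cuda/compute-sanitizer seen in the make output above and only looks at the two helper names mentioned in this post; the function name find_non_executable is made up for illustration, and other installs may use a different path.

```python
import stat
from pathlib import Path

# Install directory taken from the make output in this thread (an assumption
# for other systems; adjust as needed).
SANITIZER_DIR = Path("/opt/cuda/compute-sanitizer")
# Helper binaries named in this thread.
HELPERS = ("TreeLauncherSubreaper", "TreeLauncherTargetLdPreloadHelper")


def find_non_executable(directory, names=HELPERS):
    """Return helper files that exist in `directory` but have no execute bit set."""
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    missing_x = []
    for name in names:
        path = Path(directory) / name
        # Skip names that are absent; only report files that are present
        # but not marked executable for anyone.
        if path.is_file() and not path.stat().st_mode & any_exec:
            missing_x.append(path)
    return missing_x


if __name__ == "__main__":
    for path in find_non_executable(SANITIZER_DIR):
        print(f"not executable: {path}")
```

If it lists either helper, restoring the execute bit (for example with chmod +x as root) is presumably enough as a local workaround, though the proper fix belongs in the package.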

…I’d appreciate it if the error message were less confusing…

Thanks for your report. Where did you install CUDA from? Using the installer from the official NVIDIA website should not cause this issue. Also, we recommend always using the latest compute-sanitizer version to get all bug fixes (i.e. the one from CUDA 12.6 as of today). Thanks!

I installed it from the Gentoo repository.
It’s a Gentoo bug now LOL.