I am trying to use compute-sanitizer with my PyTorch program.
However, no matter what program I run under compute-sanitizer, it always reports “Target application terminated before first instrumented API call”.
To rule out my own code, I tried the basic Memcheck sample from the NVIDIA/compute-sanitizer-samples repository on GitHub.
Here is what I get:
❯ make run_memcheck
/opt/cuda/bin/nvcc -ccbin g++ -I/opt/cuda/include -MMD -c memcheck_demo.cu
/opt/cuda/bin/nvcc -ccbin g++ -o memcheck_demo memcheck_demo.o
/opt/cuda/compute-sanitizer/compute-sanitizer --destroy-on-device-error kernel memcheck_demo
========= COMPUTE-SANITIZER
========= Error: Target application terminated before first instrumented API call
========= Error: couldn't find exit code.
make: *** [Makefile:65: run_memcheck] Error 255
rm memcheck_demo.o
The same happens when I run a Python script under compute-sanitizer.
My installation is:
❯ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
compute-sanitizer-samples/Memcheck on master [!?] via 🅒 gpu
❯ compute-sanitizer --version
NVIDIA (R) Compute Sanitizer
Copyright (c) 2020-2024 NVIDIA Corporation
Version 2024.1.0.0 (build 33961263) (public-release)
Also, I noticed that when I launch a CUDA program with compute-sanitizer, the process does not show up in nvidia-smi.
However, when I run the same CUDA program WITHOUT compute-sanitizer, everything works fine and it appears in nvidia-smi (so my CUDA installation should be OK?):
❯ ./memcheck_demo
Mallocing memory
Running unaligned_kernel: misaligned address
Running out_of_bounds_kernel: misaligned address
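The nvidia-smi observation above can also be checked from a script instead of by eye. The following is only a minimal sketch, assuming nvidia-smi is on the PATH and that its `--query-compute-apps` CSV output has one `pid, process_name` row per process. The helper names `parse_compute_apps` and `list_compute_apps` are made up for illustration and are not part of any NVIDIA tool.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name' rows (CSV, no header) into (pid, name) tuples."""
    apps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first comma only, in case the process name contains commas.
        pid_str, _, name = line.partition(",")
        try:
            pid = int(pid_str.strip())
        except ValueError:
            # Skip anything that is not a data row (e.g. an informational message).
            continue
        apps.append((pid, name.strip()))
    return apps


def list_compute_apps():
    """Ask nvidia-smi for the processes that currently hold a compute context."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_compute_apps()` from a second terminal while the target is running (once with and once without compute-sanitizer) should show whether the process ever creates a CUDA context. If it never appears under compute-sanitizer, that would be consistent with the target exiting before its first CUDA API call, as the error message suggests.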
Thank you in advance!