I have some questions about the compute-sanitizer memcheck tool. I am assuming that it can identify all illegal accesses to memory allocated through the official CUDA APIs (cudaMalloc, cudaMallocAsync, cudaMallocFromPoolAsync). Is that correct? One step further: if somebody has developed a custom GPU memory allocator (to avoid repetitive calls to cudaMalloc), can compute-sanitizer still be used to identify illegal memory accesses? I am assuming that a self-developed memory manager may hide a lot of bugs from compute-sanitizer. Maybe this is one of the reasons to stay away from a self-developed memory manager for a GPU memory pool?
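For context, here is a minimal sketch (names and sizes are illustrative) of the kind of defect memcheck is designed to catch: an out-of-bounds write into a buffer allocated with cudaMalloc, whose bounds the tool knows because the allocation went through the CUDA runtime.

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: with no bounds check, launching one more thread
// than there are elements produces a one-past-the-end write.
__global__ void oobWrite(int *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    buf[i] = i;  // thread i == n writes out of bounds
}

int main() {
    const int n = 256;
    int *d_buf = nullptr;
    cudaMalloc(&d_buf, n * sizeof(int));
    oobWrite<<<1, n + 1>>>(d_buf, n);  // n + 1 threads over n elements
    cudaDeviceSynchronize();
    cudaFree(d_buf);
    return 0;
}
```

Run under `compute-sanitizer --tool memcheck ./a.out`, this should be reported as an invalid global write of size 4 bytes, with the offending kernel and thread identified.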
Correct, Compute Sanitizer does intercept CUDA API calls to track memory allocations. For custom memory pools, developers can report them to the tool through the NVTX API for Compute Sanitizer. An example of how to use the API is available on our GitHub samples repo here. Thanks!
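A rough sketch of that NVTX registration flow, assuming a suballocator carved out of one large cudaMalloc: the pool is registered as a heap, and each suballocation as a region within it. The struct field names below follow my recollection of the NVTX memory extension header (`nvToolsExtMem.h`); treat them as approximate and check the compute-sanitizer samples repo for the exact, current API.

```cuda
#include <cuda_runtime.h>
#include <nvtx3/nvToolsExtMem.h>  // NVTX memory extension header

int main() {
    // One large device allocation backing a custom pool (illustrative size).
    const size_t poolSize = 1 << 20;
    void *poolBase = nullptr;
    cudaMalloc(&poolBase, poolSize);

    nvtxDomainHandle_t domain = nvtxDomainCreateA("my-allocator");

    // Register the whole pool as a suballocator heap.
    nvtxMemVirtualRangeDesc_t poolRange = {poolSize, poolBase};
    nvtxMemHeapDesc_t heapDesc = {};
    heapDesc.extCompatID = NVTX_EXT_COMPATID_MEM;
    heapDesc.structSize = sizeof(heapDesc);
    heapDesc.usage = NVTX_MEM_HEAP_USAGE_TYPE_SUB_ALLOCATOR;
    heapDesc.type = NVTX_MEM_TYPE_VIRTUAL_ADDRESS;
    heapDesc.typeSpecificDescSize = sizeof(poolRange);
    heapDesc.typeSpecificDesc = &poolRange;
    nvtxMemHeapHandle_t heap = nvtxMemHeapRegister(domain, &heapDesc);

    // Register one suballocation inside the pool as a live region;
    // until this is done, memcheck treats accesses to it as invalid.
    nvtxMemVirtualRangeDesc_t subRange = {256, poolBase};
    nvtxMemRegionsRegisterBatch_t batch = {};
    batch.extCompatID = NVTX_EXT_COMPATID_MEM;
    batch.structSize = sizeof(batch);
    batch.regionType = NVTX_MEM_TYPE_VIRTUAL_ADDRESS;
    batch.heap = heap;
    batch.regionCount = 1;
    batch.regionDescElementSize = sizeof(subRange);
    batch.regionDescElements = &subRange;
    nvtxMemRegionsRegister(domain, &batch);

    // ... kernels using the suballocation run here, under memcheck ...

    nvtxMemHeapUnregister(domain, heap);
    cudaFree(poolBase);
    return 0;
}
```

With registrations like these in place, memcheck can flag accesses that fall inside the pool but outside any registered region, which is exactly what a plain suballocator would otherwise hide.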
Thanks. I am also facing an issue where compute-sanitizer gets stuck when executing against the whole software. Is this something your team can help with? More precisely, suppose I have a very heavy system to run and I would like to apply the sanitizer check to it. In the general case, compute-sanitizer gets the system's execution stuck somewhere in the middle (from what I have observed). Is there a way to tell the sanitizer to check only a small chunk of code and leave the rest alone? If there is, can it be applied to alleviate the burden of running compute-sanitizer?
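One partial mitigation (assuming a reasonably recent compute-sanitizer; the exact filter syntax varies by version, so check `compute-sanitizer --help` on yours): the tool can restrict instrumentation to particular kernels or launch ranges instead of the whole application. The kernel and application names below are placeholders.

```shell
# Only instrument the named kernel; other kernels run uninstrumented,
# which reduces the sanitizer's overhead on the rest of the system.
compute-sanitizer --tool memcheck --kernel-name mySuspectKernel ./my_app

# Alternatively, limit checking to a window of kernel launches.
compute-sanitizer --tool memcheck --launch-skip 100 --launch-count 10 ./my_app
```

This narrows the sanitizer's work but does not by itself explain a hang in CUDA initialization, which is a separate problem from instrumentation overhead.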
After some debugging, it is clear that compute-sanitizer is stuck on cudaGetDeviceCount. Is this a known bug? How can it be resolved? I am currently using the following combination:
NVIDIA (R) Compute Sanitizer
Copyright (c) 2020-2023 NVIDIA Corporation
Version 2023.3.0.0 (build 33281558) (public-release)
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla V100-PCIE-32GB On | 00000000:3B:00.0 Off | 0 |
| N/A 30C P0 25W / 250W | 0MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+