Hi, we have a Python script that calls into a C++/CUDA library we have written. I wanted to run the script through Compute Sanitizer with compute-sanitizer --tool memcheck python ./path/to/script.
I can’t share the actual code since it’s confidential IP for work, so I will try to break down the steps. In theory, CUDA shouldn’t be initialised at this point, as the order of operations should be: run Compute Sanitizer → load up the Python instance → run the script → call a library function → call CUDA functions. CUDA should only be initialised after the first call to a library function. That should be plenty of time for Compute Sanitizer to get set up and ready to analyse the code.
Searching online, I could only find a result related to a scheduler (which I don’t think is the same issue) and a suggestion to use the cuda-no-init flag, which is also not what I need because I want it to check the CUDA code.
Thanks Louis for the bug report! Unfortunately, this is a known issue. We have a potential fix, but I cannot provide a timeline for when exactly it will be released. I’ll come back to you here once we have a release target for the fix.
I would recommend printing a stack trace of where cuInit() is invoked in your code (using GDB). If my theory is correct (that it is called from a global initializer), then you could move the call somewhere else that would execute after compute-sanitizer’s injection libraries are preloaded. More information can be found in dl-init.c and other related source files from ld.so. Thanks!
Oh, we don’t use the driver API. We use the runtime API, which, if I understand correctly, would initialise the runtime on the first cudaXXX function call, no?