Compute sanitizer unable to attach to MPI processes


I have been working on a project that uses Intel MPI to distribute work across multi-GPU Azure nodes. The executable is normally launched like this from a PBS file:

mpiexec -s all -np 8 -machinefile $PBS_NODEFILE <linked executable>

To use cuda-memcheck, we have been able to simply run this:

mpiexec -s all -np 8 -machinefile $PBS_NODEFILE cuda-memcheck <linked executable>

This worked just fine and reported errors for each MPI rank. However, we have recently run into some issues that aren’t being reported, and since cuda-memcheck is deprecated, I wanted to upgrade to Compute Sanitizer. But the following doesn’t work:

mpiexec -s all -np 8 -machinefile $PBS_NODEFILE compute-sanitizer <linked executable>

This returns “Error: No attachable process found. compute-sanitizer timed-out.” Since I cannot run the executable directly on the nodes it runs on, I cannot manually attach to the process (as far as I know). Following the unanswered post here, I also tried the following, to no avail:

mpiexec -s all -np 8 -machinefile $PBS_NODEFILE compute-sanitizer --require-cuda-init=no --max-connections=1000 <linked executable>

Has anyone been able to use the sanitizer as a direct drop-in replacement for memcheck with MPI processes? Is there something I am missing or need to fix in order for this to work? Any help is appreciated!

PS: I am using the v11.7 toolkit.

A few ideas:

  • Is your executable running subprocesses? If that is the case then please try with option --target-processes all.
  • Is your executable 32-bit or executing 32-bit subprocesses? If that is the case then please try with option --support-32bit yes.

Please let me know if that helps!

Thank you so much for the help! The first option --target-processes all fixed the issue. Much appreciated!
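
For anyone landing here later, below is a minimal sketch of the launch line that ended up working for us. `APP` and the `hosts.txt` fallback are hypothetical placeholders standing in for <linked executable> and the PBS-provided machine file. The script only prints the assembled command so it can be checked before it goes into the PBS file.

```shell
#!/bin/sh
# Sketch only: APP is a hypothetical placeholder for the MPI-linked binary.
APP="${APP:-./my_app}"
# PBS sets PBS_NODEFILE inside a job; hosts.txt is an assumed fallback for dry runs.
NODEFILE="${PBS_NODEFILE:-hosts.txt}"

# --target-processes all asks compute-sanitizer to track child processes too,
# not just the process it launched directly.
SANITIZER_CMD="compute-sanitizer --target-processes all $APP"
LAUNCH_CMD="mpiexec -s all -np 8 -machinefile $NODEFILE $SANITIZER_CMD"

# Print the launch line for inspection before using it in the PBS file.
echo "$LAUNCH_CMD"
```

In the PBS file itself, the printed line can replace the original cuda-memcheck line. As I understand it, `--target-processes all` makes the sanitizer track child processes as well, which is presumably why it can then find the CUDA process when launched under mpiexec.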
