Hello. I am trying to profile multiple programs executing on the same machine. However, the second program launched with ncu does not make progress. Is it not possible to run more than one ncu profiler at a time? To be more specific, I am launching ncu inside isolated containers. (It is not a problem with the docker containers themselves, as I have already checked that profiling inside a single docker container works well.)
Additionally, if the situation described above is not possible, can you tell me why? Is it due to some global lock in the CUDA driver, or something else? The root cause seems odd to me, given that NCU now even supports multi-process profiling.
You can launch multiple instances of the tool and profile; however, Nsight Compute will serialize them via a filesystem-level lock, because the driver performance monitor is a shared resource. There is more information on this required serialization in the Kernel Profiling Guide :: Nsight Compute Documentation.
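To make the serialization behavior concrete, here is a minimal sketch of how a filesystem-level lock forces two processes to run one after the other. This uses `flock(1)` as a stand-in for Nsight Compute's internal lock; the lock-file path and the "A"/"B" labels are made up for illustration.

```shell
# Two subshells contend for the same lock file; the second blocks
# until the first releases the lock, so their work is serialized.
lockfile=$(mktemp -u)   # path for the shared lock file
log=$(mktemp)           # record the order the two "profilers" finish in

# Holder A takes the lock, "profiles" for half a second, then releases it.
( flock 9; sleep 0.5; echo "A done" >>"$log" ) 9>"$lockfile" &

sleep 0.1   # give A time to acquire the lock first
# Holder B blocks on the same lock file until A is finished.
( flock 9; echo "B done" >>"$log" ) 9>"$lockfile"
wait

cat "$log"   # A done, then B done: B could not start until A released the lock
```

The same mechanism explains the observed hang: the second ncu instance is not stuck, it is waiting for the first instance to release the shared lock.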
Thank you very much. I made a workaround by following the docs you shared. I mitigated the problem by exporting a different TMPDIR in each docker container. That way, each NCU profiler in each container used a different lock file and no longer hung.
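For reference, the workaround can be sketched as a pair of container launches, each with its own TMPDIR. This is only a hypothetical configuration fragment: the image name, app paths, and report names are made up, and each TMPDIR must be a writable directory inside its container.

```shell
# Each container gets a distinct TMPDIR, so each ncu instance creates
# its lock file in a separate directory and the two runs do not block
# each other on the filesystem-level lock.
docker run --gpus all -e TMPDIR=/tmp/ncu-lock-a my-image \
    ncu -o report_a ./app_a &
docker run --gpus all -e TMPDIR=/tmp/ncu-lock-b my-image \
    ncu -o report_b ./app_b &
wait
```

Note that, as pointed out in the next reply, this only removes the lock-file serialization; the underlying hardware performance monitor is still shared.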
The driver performance monitor is a shared hardware resource, including when multiple containers are used, so your solution may still run into issues. There is a documented prerequisite for the workaround of changing this temp directory:
- If it is otherwise ensured that no concurrent NVIDIA Nsight Compute instances are active on the same system, set TMPDIR to a different directory for which the current user has write permissions.
In your environment it is still possible that concurrent NVIDIA Nsight Compute instances are active on the same system. This is a hardware limitation that can't easily be avoided. I wanted to make sure that was clear in case you run into the issue again in the future.