I’m running a CUDA application under MPI on a GPU cluster. Each compute node has 2 GPUs, and I have configured the MPI scheduler so that at most 2 processes run per node, i.e. one per GPU. How can I query whether a GPU is already in use? This is straightforward by visual inspection with nvidia-smi, but I would like to be able to do it programmatically using the CUDA Runtime API.
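A sketch of the kind of check I have in mind, assuming the devices are set to "exclusive process" compute mode (where `cudaFree(0)` forces context creation and should fail if another process already owns the device):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        // Query the compute mode (default, exclusive process, prohibited, ...)
        int mode = 0;
        cudaDeviceGetAttribute(&mode, cudaDevAttrComputeMode, dev);

        // Try to establish a context on this device; under exclusive
        // process mode this fails if the GPU is held by another process.
        cudaSetDevice(dev);
        cudaError_t err = cudaFree(0);

        printf("device %d: compute mode %d, %s\n", dev, mode,
               err == cudaSuccess ? "available" : "busy or unavailable");

        // Release the context we just created so the probe has no side effects.
        cudaDeviceReset();
    }
    return 0;
}
```

I’m not sure this is reliable in the default compute mode, though, since multiple processes can share a GPU there and `cudaFree(0)` would succeed regardless.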
Can it be done?