How does NVML tell there is a "running process" on a certain GPU card?

Hi all,

I am coding a simple multi-GPU scheduler, using APIs from nvml.dll.
Can I double-check: at what point will “nvmlDeviceGetComputeRunningProcesses” report that a process is running?
From cudaSetDevice()?
From the 1st cudaMalloc?
Or from the 1st kernel call?

From my testing it seems to be the 1st cudaMalloc, but I can’t find anything in the API documentation about this.
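For reference, this is roughly how I am polling it from the scheduler, a minimal sketch that assumes GPU index 0 and at most 64 processes (link with -lnvidia-ml; requires the driver’s nvml.h):

```c
#include <stdio.h>
#include <nvml.h>

int main(void) {
    nvmlReturn_t rc = nvmlInit();
    if (rc != NVML_SUCCESS) {
        fprintf(stderr, "nvmlInit failed: %s\n", nvmlErrorString(rc));
        return 1;
    }

    nvmlDevice_t dev;
    rc = nvmlDeviceGetHandleByIndex(0, &dev);
    if (rc == NVML_SUCCESS) {
        unsigned int count = 64;               /* in: capacity, out: actual count */
        nvmlProcessInfo_t infos[64];
        rc = nvmlDeviceGetComputeRunningProcesses(dev, &count, infos);
        if (rc == NVML_SUCCESS) {
            printf("%u compute process(es) on GPU 0\n", count);
            for (unsigned int i = 0; i < count; ++i)
                printf("  pid %u, %llu bytes of device memory\n",
                       infos[i].pid,
                       (unsigned long long)infos[i].usedGpuMemory);
        }
    }
    nvmlShutdown();
    return 0;
}
```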


It is detecting whether a CUDA context is established on the device. If you don’t know what a CUDA context is, please read the relevant section of the programming guide. A host process and a CUDA context are closely related.
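You can observe this for yourself. At least in the CUDA versions current for that era of hardware, cudaSetDevice() alone does not create the context; context creation is lazy, and cudaFree(0) is the usual idiom to force it. A sketch (run this while polling NVML, e.g. via nvidia-smi, from another shell):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaSetDevice(0);          // selects the device; context creation is still lazy
    printf("after cudaSetDevice - press Enter\n");
    getchar();                 // poll NVML now: no process should be listed

    cudaFree(0);               // idiom to force context creation
    printf("after cudaFree(0) - press Enter\n");
    getchar();                 // poll NVML now: this pid should be listed

    cudaDeviceReset();         // tears the context down again
    return 0;
}
```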

I got it, thanks.
A side question: if GPU devices are limited, is sharing a GPU device (multiple contexts from different host processes) a good idea performance-wise, or is it better to run one job at a time?


Multiple contexts from different host processes will context-switch on the GPU. Kernels from separate host processes cannot run concurrently (except with the use of CUDA MPS).
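For completeness, since MPS came up: on Linux the MPS control daemon is started with nvidia-cuda-mps-control. A sketch; the pipe/log directories below are example paths, not required ones:

```shell
# Start the MPS control daemon (as the user who owns the GPU).
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log
nvidia-cuda-mps-control -d

# CUDA client processes that see the same CUDA_MPS_PIPE_DIRECTORY will now
# funnel their kernels through a shared server context instead of
# context-switching against each other.

# Shut the daemon down again:
echo quit | nvidia-cuda-mps-control
```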

Thanks for the reply.
Will the impact from context switching be substantial?
My real scenario is like:

Card: K40m
Workflow of both Host Process A & B:
1. cudaMalloc
2. 10,000 separate kernel calls (each one very fast)
3. cudaFree

What worries me is whether sharing will cause a substantial performance hit. If so, I will need to fine-tune my GPU scheduler to enforce “1 job at a time”.
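In case it helps others, I plan to measure the two scenarios directly rather than guess: time the 10,000-launch loop with CUDA events, once running alone and once with a second copy of the process running concurrently. A sketch, where the kernel body and sizes are placeholders for my real workload:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder for the real (very fast) kernel.
__global__ void tiny_kernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}

int main() {
    const int n = 1 << 16;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int k = 0; k < 10000; ++k)
        tiny_kernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("10000 launches took %.1f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

Comparing the solo time against the shared time should show how much the context switching actually costs for this launch pattern.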