I found that the NVML API can report PCIe usage per device, but is it possible to get PCIe usage per process? Is there such an API provided by NVIDIA or by any third-party library? Thank you!
There is no per-process API for this.
- You have to measure bandwidth inside the process itself, timing your transfers with cudaEventRecord()/cudaEventElapsedTime() (as in NVIDIA_CUDA-10.0_Samples/1_Utilities/bandwidthTest/).
- Or measure globally (per device) with "nvidia-smi dmon -s t" or with the API nvmlDeviceGetPcieThroughput() (https://docs.nvidia.com/deploy/nvml-api/group__nvmlDeviceQueries.html#group__nvmlDeviceQueries_1gd86f1c74f81b5ddfaa6cb81b51030c72). This function queries a byte counter over a 20 ms interval, so the value it returns is the PCIe throughput over that interval.
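To make the two options concrete, here is a minimal Python sketch, not a definitive implementation. The names `bandwidth_gb_per_s`, `TransferLog` and `device_pcie_kb_per_s` are hypothetical helpers made up for illustration. The first two assume you already have the byte count of each copy and its elapsed time in milliseconds (the unit cudaEventElapsedTime() reports). The last one assumes the `pynvml` bindings (the nvidia-ml-py package) are installed and a GPU is present, and it only gives the device-wide number, not a per-process one.

```python
from dataclasses import dataclass
from typing import Tuple


def bandwidth_gb_per_s(num_bytes: int, elapsed_ms: float) -> float:
    """Average bandwidth in GB/s (1 GB = 1e9 bytes) for one timed transfer."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return num_bytes / (elapsed_ms / 1e3) / 1e9


@dataclass
class TransferLog:
    """Hypothetical per-process tally of the copies this process timed itself."""
    total_bytes: int = 0
    total_ms: float = 0.0

    def record(self, num_bytes: int, elapsed_ms: float) -> None:
        # elapsed_ms is assumed to come from an event pair around the copy
        self.total_bytes += num_bytes
        self.total_ms += elapsed_ms

    def average_gb_per_s(self) -> float:
        return bandwidth_gb_per_s(self.total_bytes, self.total_ms)


def device_pcie_kb_per_s(index: int = 0) -> Tuple[int, int]:
    """Device-wide (TX, RX) PCIe throughput in KB/s as reported by NVML."""
    import pynvml  # assumption: nvidia-ml-py is installed and a GPU is present

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        tx = pynvml.nvmlDeviceGetPcieThroughput(
            handle, pynvml.NVML_PCIE_UTIL_TX_BYTES)
        rx = pynvml.nvmlDeviceGetPcieThroughput(
            handle, pynvml.NVML_PCIE_UTIL_RX_BYTES)
        return tx, rx
    finally:
        pynvml.nvmlShutdown()
```

For example, a 64,000,000-byte copy that took 8 ms works out to 8 GB/s. Keep in mind the in-process tally only covers the transfers you instrument yourself, so it is an estimate of that process's traffic rather than a true hardware counter.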