NVML overhead on latency of applications running on the GPU

Hi,
I am planning to query the status of my GPU once per second using NVML while an inference process is running on the same GPU.
Does querying GPU status through NVML have any side effect on the latency of that inference? I do not want the inference latency to be affected.
By inference latency I mean the time it takes to run a machine learning model on the GPU with one batch of inputs.
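
For reference, here is roughly the polling loop I have in mind (a minimal sketch using the pynvml Python bindings; the device index and the particular queries are just examples, not my exact setup):

```python
# Minimal NVML polling sketch using the pynvml bindings.
# The device index and the chosen queries are illustrative.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assuming the inference runs on GPU 0

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # GPU/memory utilization (%)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # framebuffer memory in bytes
        power = pynvml.nvmlDeviceGetPowerUsage(handle)       # current draw in milliwatts
        print(f"util={util.gpu}% mem={mem.used / 2**20:.0f} MiB power={power / 1000:.1f} W")
        time.sleep(1)  # poll once per second
finally:
    pynvml.nvmlShutdown()
```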

Thank you,
Sujan