How does the frequency setting interface of NVML affect NCCL communication?

771635602 · December 25, 2024, 8:00am

During model training, I performed frequent DVFS adjustments using pynvml.nvmlDeviceSetGpuLockedClocks for frequency tuning. In my experiments, calling nvmlDeviceSetGpuLockedClocks at a high rate (approximately every 50 ms) caused significant delays in certain NCCL communication operators. The delay does not seem to be related to the frequency value itself: if I fix a very low frequency and avoid frequent adjustments, the delay does not occur. Does anyone know why this happens?
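For anyone who wants to reproduce this, here is a minimal sketch of how the cost of each clock-setting call could be measured at a 50 ms cadence. The helper names (`time_clock_sets`, `make_nvml_setter`) are made up for illustration and are not part of pynvml; the sketch assumes the `pynvml` package is installed and that the process has the privileges needed to lock clocks. It only times the NVML call itself and does not measure NCCL.

```python
import time


def time_clock_sets(set_clocks, freqs_mhz, interval_s=0.05):
    """Call set_clocks(freq) once per entry, spaced about interval_s apart.

    Returns the wall-clock duration of each call in seconds.
    """
    latencies = []
    for freq in freqs_mhz:
        start = time.perf_counter()
        set_clocks(freq)
        elapsed = time.perf_counter() - start
        latencies.append(elapsed)
        # Sleep for the rest of the interval so calls stay ~interval_s apart.
        time.sleep(max(0.0, interval_s - elapsed))
    return latencies


def make_nvml_setter(gpu_index=0):
    """Hypothetical helper: returns a function that locks the GPU clock of one device.

    Assumes pynvml is installed and the caller may change clocks
    (usually this requires root).
    """
    import pynvml  # imported lazily so the timing helper works without a GPU

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)

    def set_clocks(freq_mhz):
        # Lock min and max to the same value to pin the clock.
        pynvml.nvmlDeviceSetGpuLockedClocks(handle, freq_mhz, freq_mhz)

    return set_clocks


# Example usage on a machine with a GPU (the frequencies are placeholders):
#   setter = make_nvml_setter(0)
#   lat = time_clock_sets(setter, [1200, 1400] * 50)
#   print(max(lat), sum(lat) / len(lat))
```

Comparing these per-call times with the observed NCCL delays may help show whether the slowdown lines up with the calls themselves.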

Additionally, when eight processes on a server each set the frequency of their own GPU at the same time (eight GPUs in total), are the NVML operations executed in parallel or sequentially?
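One way this could be probed empirically: time the same call from a single process, then again while all eight processes call at once, and compare the medians. The function below is only a rough heuristic with a made-up name (`serialization_factor`). A ratio near 1 would suggest the calls proceed independently, while a ratio that grows toward the number of processes would be consistent with the calls being serialized somewhere, though other sources of contention could produce the same effect.

```python
import statistics


def serialization_factor(solo_latencies, concurrent_latencies):
    """Ratio of the median per-call latency under concurrency to the median when alone.

    Both arguments are lists of per-call durations in seconds, measured
    by the caller (for example with time.perf_counter around each NVML call).
    """
    solo_median = statistics.median(solo_latencies)
    if solo_median <= 0:
        raise ValueError("solo latencies must have a positive median")
    return statistics.median(concurrent_latencies) / solo_median


# Example with placeholder numbers (not real measurements):
#   serialization_factor([0.002, 0.002, 0.003], [0.016, 0.015, 0.017])
```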
