I was benchmarking my kernels on a V100 provided by Google Cloud, and found that they run about 18% faster while watch -n 0.1 nvidia-smi is running.
Persistence mode is enabled. The CUDA driver was installed from the official .deb package without any modifications. Driver Version: 455.23.05, CUDA Version: 11.1.
One benchmark involves several kernel calls, and the benchmark is run multiple times, so I assume the kernels are warmed up by the second run.
The time is measured separately with CUDA events and with a CPU timer, and the two measurements are consistent (roughly the setup sketched below).
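For reference, here is a minimal sketch of how the timing is done; the kernel, launch configuration, and problem size are placeholders, not the actual benchmark:

```cpp
#include <cstdio>
#include <chrono>
#include <cuda_runtime.h>

// Stand-in for the real memory-bound benchmark kernels.
__global__ void my_kernel(float* out, const float* in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

int main() {
    const int n = 1 << 24;
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    // Warm-up launch so the measured run is not the first one.
    my_kernel<<<(n + 255) / 256, 256>>>(out, in, n);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Measure the same launch with CUDA events and a CPU timer.
    auto cpu_start = std::chrono::steady_clock::now();
    cudaEventRecord(start);
    my_kernel<<<(n + 255) / 256, 256>>>(out, in, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    auto cpu_end = std::chrono::steady_clock::now();

    float gpu_ms = 0.0f;
    cudaEventElapsedTime(&gpu_ms, start, stop);
    double cpu_ms = std::chrono::duration<double, std::milli>(cpu_end - cpu_start).count();
    printf("CUDA events: %.3f ms, CPU timer: %.3f ms\n", gpu_ms, cpu_ms);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```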
The running time with watch -n 0.1 nvidia-smi is closer to the time measured by ncu.
The kernels are all memory-bandwidth bound (see the rough check below).
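By that I mean the effective bandwidth (total bytes moved divided by kernel time) is close to the V100's HBM2 peak of roughly 900 GB/s; a back-of-the-envelope check with hypothetical numbers looks like this:

```cpp
#include <cstdio>

int main() {
    // Hypothetical numbers: one read plus one write of 2^24 floats,
    // and a measured kernel time of 0.2 ms.
    double bytes_moved = 2.0 * (1 << 24) * sizeof(float);
    double kernel_ms   = 0.2;
    double gbps = bytes_moved / (kernel_ms * 1e-3) / 1e9;
    // Against a ~900 GB/s theoretical peak, ~670 GB/s indicates a
    // bandwidth-bound kernel.
    printf("effective bandwidth: %.1f GB/s\n", gbps);
    return 0;
}
```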
I know it sounds extremely weird, but it indeed happened. I am wondering if anyone has run into the same problem or has any comments on this.