OS: Windows Server 2016
GPU: Tesla T4 and Tesla P4
Drivers: 411.98 and 412.36
We use an ffmpeg build with the cuvid decoder enabled (the problem also reproduces with other GPU decoders).
We run this command:
ffmpeg -c:v h264_cuvid -i <video file> -f null -
and observe the decoder utilization with this command:
nvidia-smi.exe -q -l 1 | FINDSTR Decoder
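To compare runs more easily, the filtered nvidia-smi output can be saved to a file and averaged. The following is only a rough sketch: the helper names (decoder_samples, summarize) and the sample values are made up for illustration, and it assumes each sample line looks like "Decoder : 87 %" and that only one GPU is being queried (for example by adding -i <index> to the nvidia-smi command).

```python
import re
from statistics import mean

# Assumed line shape from `nvidia-smi -q`, e.g. "        Decoder      : 87 %".
# Lines reporting "N/A" do not match and are skipped.
DECODER_LINE = re.compile(r"Decoder\s*:\s*(\d+)\s*%")


def decoder_samples(lines):
    """Collect decoder utilization percentages from nvidia-smi output lines."""
    samples = []
    for line in lines:
        match = DECODER_LINE.search(line)
        if match:
            samples.append(int(match.group(1)))
    return samples


def summarize(samples):
    """Return count, mean and max of the samples, or None if there are none."""
    if not samples:
        return None
    return {"count": len(samples), "mean": mean(samples), "max": max(samples)}


# Hypothetical sample lines, not measured data.
example_lines = [
    "        Decoder                     : 87 %",
    "        Decoder                     : 88 %",
    "        Decoder                     : N/A",
    "        Decoder                     : 86 %",
]
print(summarize(decoder_samples(example_lines)))
```

In practice the lines would come from a file written by redirecting the command above, read with open(path) instead of the hard-coded list.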
Testing on a public video from:
The video is very short, so we create a longer one by looping it with -stream_loop 4:
ffmpeg.exe -c:v h264_cuvid -stream_loop 4 -i video.mp4 video_loopX4.mp4
Comparing 2 driver versions:
Tesla P4 - Windows Server 2016
Driver 411.98: ~358 fps, 87% decoder utilization
Driver 412.36: ~355 fps, 88% decoder utilization
Tesla T4 - Windows Server 2016
Driver 411.98: ~434 fps, 34% decoder utilization
Driver 412.36: ~194 fps, 29% decoder utilization
Running the same test on Linux, we are able to achieve 100% (P4) / 50% (T4) decoder utilization and a much higher decode frame rate.
Tesla T4 - Linux (Ubuntu 18.04.1 LTS)
Driver 415.27: ~620 fps, 50% decoder utilization
Actual fps and decoder utilization vary with the input video, but on Windows Server neither GPU ever reaches the decoding performance seen on Linux.
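To put the gap in numbers, here is a small sketch that computes the relative differences from the figures above. It assumes the ~434 fps and ~194 fps results are the Tesla T4 on Windows (which matches the T4 slowdown after the driver update); the function names are only illustrative.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def percent_of(value, reference):
    """`value` expressed as a percentage of `reference`."""
    return value / reference * 100.0


# Approximate fps figures copied from the post (Tesla T4, assumed labeling).
t4_windows_411_98 = 434
t4_windows_412_36 = 194
t4_linux_415_27 = 620

# Change on Windows when moving from driver 411.98 to 412.36.
print(round(percent_change(t4_windows_411_98, t4_windows_412_36), 1))

# Windows throughput as a share of the Linux result, per driver.
print(round(percent_of(t4_windows_411_98, t4_linux_415_27), 1))
print(round(percent_of(t4_windows_412_36, t4_linux_415_27), 1))
```

With these inputs, the newer Windows driver delivers less than half the frame rate of the older one, and under a third of the Linux result.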
Update: the Tesla T4 decodes at a lower fps after the driver update.