Possible reasons why cuvidDecodePicture may block for 20 seconds when decoding HEVC?

svechnikov66 · April 24, 2019, 10:10am

Our situation is the following:

We transcode two 4K HEVC video streams using ffmpeg on NVIDIA GPUs.

Sometimes (anywhere from a few times a day to once every couple of days) we observe an error - “Circular buffer overrun. To avoid, increase fifo_size URL option. To survive in such case, use overrun_nonfatal option”.

Right before the error we observe that the speed of transcoding drops drastically:

Apr 24 07:50:14 frame=4537333 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.06 bitrate=N/A dup=1 drop=0
Apr 24 07:50:15 frame=4537359 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.58 bitrate=N/A dup=1 drop=0
Apr 24 07:50:36 frame=4537369 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.78 bitrate=N/A dup=1 drop=0
Apr 24 07:50:36 frame=4537369 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.78 bitrate=N/A dup=1 drop=0
Apr 24 07:50:57 frame=4537371 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.82 bitrate=N/A dup=1 drop=0
Apr 24 07:51:07 frame=4537372 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.84 bitrate=N/A dup=1 drop=0

Note the second and third lines: it took 21 seconds (07:50:36 - 07:50:15 = 21 seconds) to transcode 10 frames (4537369 - 4537359).
The fifth and sixth lines show that it took 10 seconds to transcode just 1 frame.
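
For anyone who wants to spot such stalls automatically, here is a minimal sketch that scans the timestamped progress lines above and reports every interval where the wall clock jumped by more than a few seconds. The helper name `find_stalls`, the 5-second threshold, and the `year` argument are my own assumptions, not anything ffmpeg provides; it also assumes English month abbreviations in the log prefix.

```python
import re
from datetime import datetime

LOG = """\
Apr 24 07:50:14 frame=4537333 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.06 bitrate=N/A dup=1 drop=0
Apr 24 07:50:15 frame=4537359 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.58 bitrate=N/A dup=1 drop=0
Apr 24 07:50:36 frame=4537369 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.78 bitrate=N/A dup=1 drop=0
Apr 24 07:50:36 frame=4537369 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.78 bitrate=N/A dup=1 drop=0
Apr 24 07:50:57 frame=4537371 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.82 bitrate=N/A dup=1 drop=0
Apr 24 07:51:07 frame=4537372 fps= 50 q=-1.0 q=34.0 size=N/A time=25:12:28.84 bitrate=N/A dup=1 drop=0
"""

# Syslog-style prefix ("Apr 24 07:50:14") followed by ffmpeg's frame counter.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) frame= *(\d+)")


def find_stalls(log_text, min_gap_s=5, year=2019):
    """Return (gap_seconds, frames_advanced, end_time) for each slow interval.

    The log prefix carries no year, so one is supplied by the caller
    (hypothetical parameter, only needed to build a full datetime).
    """
    samples = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            samples.append((ts, int(m.group(2))))

    stalls = []
    # Compare each progress line with the one right after it.
    for (t0, f0), (t1, f1) in zip(samples, samples[1:]):
        gap = (t1 - t0).total_seconds()
        if gap >= min_gap_s:
            stalls.append((gap, f1 - f0, t1.strftime("%H:%M:%S")))
    return stalls


if __name__ == "__main__":
    for gap, frames, end in find_stalls(LOG):
        print(f"stall ending {end}: {gap:.0f} s for {frames} frame(s)")
```

On the excerpt above this flags three intervals: the two called out in the text plus the 21-second gap between the fourth and fifth lines.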

And it happened all of a sudden - as you can see in the logs, before that there had been 25 hours of successful transcoding and no errors/warnings.

I tracked down the culprit - the process is blocked inside the function cuvidDecodePicture (https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/cuviddec.c#L339). But I am unable to investigate any further, because I don’t have the source of this function. Has anybody run into the same issue?

More details:
Graphics card - GTX 1080/GTX 1050 (the error manifests itself no matter which card we’re using)
Driver version - 418.56 (we tried drivers 3xx.xx, didn’t help)
Docker as runtime environment (we tried nvidia images with versions 9.2-devel-ubuntu16.04 and 10.1-devel-ubuntu18.04 (https://hub.docker.com/r/nvidia/cuda/))
As input we’re using multicast mpegts HEVC video. Here’s an example of ffmpeg command:

ffmpeg -y -xerror -scan_all_pmts 0 \
    -hwaccel cuvid -c:v hevc_cuvid \
    -copyts -start_at_zero \
    -i "udp://@225.0.0.1:1234?fifo_size=688128" \
    -c:v hevc_nvenc \
    -map 0:0 -map 0:0 -map 0:1 \
    -rc vbr \
    -c:v:0 copy \
    -qmin:v:1 21 \
    -qmax:v:1 35 \
    -b:v:1 8000000 \
    -maxrate:v:1 8800000 \
    -bufsize:v:1 4000000 \
    -filter:v:1 scale_cuda=w=1920:h=1080 \
    -g 250 -r 50 \
    -c:a:0 copy \
    -f mpegts /dev/null
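
To put the fifo_size value in perspective, here is a rough sketch of how long the UDP circular buffer can absorb a stalled decoder before it overruns. As far as I understand the ffmpeg udp protocol documentation, fifo_size is expressed in 188-byte packets; the 40 Mbit/s input bitrate below is purely an assumed figure for illustration (the real stream bitrate is not stated here), and the function names are my own.

```python
TS_PACKET_BYTES = 188  # fifo_size is counted in 188-byte packets (per ffmpeg udp docs)


def fifo_capacity_bytes(fifo_size_packets):
    """Total size of the UDP receive FIFO in bytes."""
    return fifo_size_packets * TS_PACKET_BYTES


def seconds_until_overrun(fifo_size_packets, input_bitrate_bps):
    """How long the FIFO can buffer input if nothing is being read from it."""
    return fifo_capacity_bytes(fifo_size_packets) * 8 / input_bitrate_bps


if __name__ == "__main__":
    FIFO_SIZE = 688128            # value from the command above
    ASSUMED_BITRATE = 40_000_000  # assumption: 40 Mbit/s input, not a measured value

    print(fifo_capacity_bytes(FIFO_SIZE))  # 129368064 bytes, roughly 123 MiB
    print(seconds_until_overrun(FIFO_SIZE, ASSUMED_BITRATE))
```

Under that assumed bitrate the buffer holds roughly 26 seconds of input, so consecutive stalls of 21, 21 and 10 seconds would be enough to trigger the overrun, and increasing fifo_size further would only delay the error rather than prevent it.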

What’s interesting is that the problem can’t be reproduced from a recording of the same video: after hitting the problem on the multicast stream, I transcoded the same sample (which I had saved as an mpegts file on the file system) without any issue.

What’s even more interesting, today we reproduced the same error at the same time on two servers, which were transcoding the same streams in parallel.

What could cause the problem?

svechnikov.sergey · June 11, 2019, 5:29am

Seems it got fixed in the latest stable driver (430.14). There have been no hangs in the several days since I started transcoding with the new driver.
