HEVC decoding error with FFMPEG

Having built ffmpeg with nvidia codec, I am randomly getting these errors upon startup of the video streams:

[hevc_cuvid @ 0x558a37ff7f00] ctx->cvdl->cuvidDecodePicture(ctx->cudecoder, picparams) failed → CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
[hevc_cuvid @ 0x558a37ff7f00] cuvid decode callback error

I am simultaneously running multiple stream decoding both h264 and h265 on an RTX 2070 with the latest CUDA11.1/CuDNN8.05 nv-codec-headers and driver 455.45.01 on ubuntu18 and only the hevc streams have this error which crashes the stream decoding launch. Once it is launched, it will run until I kill it. Out of the 3 streams, I see cases where zero, one, two or all three streams would fail with this error. This has been going on for quite some time now so it’s been a few versions of drivers/cuda and codec. It’s been quite annoying. Any insight as to how to fix this?

Hi,
Sorry, it is impossible to guess what could be going wrong. Will it be possible for you to provide reproducer with detailed instructions and command line to help reproduce this issue and debug?

Thanks.

Hi, Thanks for following up. It is quite challenging because my setup and use case are both very unique and also because the issue is very inconsistent.

The program I use is fairly complex. It is based on Home Assistant. A home automation project written on python. I am using ffmpeg with the latest nvidia driver, cuda and cudnn version but the focus should be on cuda. ffmpeg is actually embedded into opencv, which I also compiled myself. The inconsistency of the issue is leading me to think that it is a command timing issue. Within the one single python script, I am launching 3 hevc and 5 h264 stream decoding processes. The python script has a built in retry loop which eventually crashes the entire program when the retries do not succeed. The error seems to put the gpu decoder in a bad state: If it fails once, it will continue to fail for some time, maybe a minute or so. Eventually I am seeing about 1 out of 3 startup being successful but the error only occurs on an hevc launch. Never on an h264 stream. Also as I said, once it is launched, it never crashes.

An update to this problem. I have now moved to an RTX3070 with driver 460.27.04, cuda 11.2, CuDNN8.05 and am still seeing the same problem but with a much lower probability. Failures have gone from 60% of my program starts to less than 20% but they still occur. I think it is an error handling issue with nvdec/cuda specific to the hevc decoder.