NVDEC in FFMPEG (cuvid) drops frames when using deinterlacing (deint 2)

Recent ffmpeg versions fail to use the nvdec (cuvid) deinterlacer correctly. The last version known to work is 3.3.8; newer versions drop frames.
Example command, input.ts is a 1080i50 file:
ffmpeg -hwaccel cuvid -c:v h264_cuvid -deint adaptive -resize 1280x720 -f mpegts -i input.ts -vcodec h264_nvenc -preset slow -c:a copy -r 50 -f mpegts output.ts

With 3.3.8 everything works fine, with 3.4 and above we have dropped frames.

Example output

frame= 1986 fps=639 q=33.0 Lsize= 13142kB time=00:02:30.52 bitrate= 715.2kbits/s dup=0 drop=1798 speed=48.4x

And output is unplayable.
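
To put a number on the drops, here is a rough sketch that pulls the counters out of a status line like the one above. The helper name drop_summary is made up for illustration. It assumes frame= counts frames written to the output and drop= counts frames discarded by the rate conversion.

```shell
# Sketch: summarize the counters in an ffmpeg status line.
# drop_summary is a hypothetical helper name, not part of ffmpeg.
drop_summary() {
    line=$1
    # "frame=" may be followed by padding spaces, as in "frame= 1986"
    frames=$(printf '%s\n' "$line" | sed -n 's/.*frame= *\([0-9][0-9]*\).*/\1/p')
    drop=$(printf '%s\n' "$line" | sed -n 's/.*drop= *\([0-9][0-9]*\).*/\1/p')
    awk -v f="$frames" -v d="$drop" 'BEGIN {
        total = f + d
        pct = total ? 100 * d / total : 0   # avoid dividing by zero on empty input
        printf "frames=%d drop=%d dropped_pct=%.1f\n", f, d, pct
    }'
}

drop_summary 'frame= 1986 fps=639 q=33.0 Lsize= 13142kB time=00:02:30.52 bitrate= 715.2kbits/s dup=0 drop=1798 speed=48.4x'
# prints: frames=1986 drop=1798 dropped_pct=47.5
```

So in this run close to half of the frames coming out of the deinterlacer were thrown away.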

With ffmpeg 3.3.8 WITHOUT the -r 50 option on the output, we also get drops. But with -r 50 on the output, we get a working file.

Should I report this to ffmpeg? Or is it a cuvid issue that makes it incompatible with recent ffmpeg versions?

A ticket in ffmpeg trac system is already open about this issue.
https://trac.ffmpeg.org/ticket/6971

I have traced the issue to the following ffmpeg commit:

https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/bddb2343b6e594e312dadb5d21b408702929ae04

  • commit ‘061a0c14bb5767bca72e3a7227ca400de439ba09’:
    decode: restructure the core decoding code

CUVID decoder adapted by wm4.

Maybe you should help the ffmpeg devs to fix this.

Hi malakudi,

Could you give us some additional information:

  1. GPU
  2. Video Codec SDK version being used
  3. Driver version

Thanks,
Ryan Park

  1. Pascal GPU, tried Quadro P2000, GTX 1070 Ti. Both show same result.
  2. Tried 8.0 and 8.1 - older ffmpeg versions with 8.0, newer with 8.1
  3. 384.xx and 390.xx, on Linux

The commit that triggers the issue is a redesign of ffmpeg's core decode code. The cuvid code was “adapted” to the new core decode code by a coder named “wm4”. I don’t know if he/she is related to NVIDIA or is an ffmpeg developer, but the issue is in this redesign. It affects versions 3.4 and above, since it was implemented in the master branch after the 3.3 release was cut from master.

As it is, I can’t use the cuvid deinterlacer with ffmpeg 3.4 and above, and I am forced to use 3.3.x versions.

You should set -drop_second_field parameter to 1 ;)

Of course not. And if you suggest that, you don’t care about the quality of your encodes. This is not real deinterlacing: you lose half of the motion detail, which shows clearly in fast-moving scenes. You can’t keep the fluidity of movement that 50i has when converting to 25p. Only conversion to 50p keeps all the fluidity.

I want a real deinterlacing with the default option of -drop_second_field to 0. This creates a 50p output with all the detail.

The issue of course is only shown when converting to 50p (or 60p for 1080i60), with -drop_second_field set to 0, which is also the default option.

Hi Malakudi,

Which OS are you using specifically?

Thanks,
Ryan Park

Linux, Debian 9

I can test in Windows 10 if you wish.

Tried in Windows 10, same behaviour.

Is any dev looking at this? Other h/w accelerated decoders use a filter for deinterlacing (deinterlace_qsv, deinterlace_vaapi) so they are not affected by “decode: restructure the core decoding code” change in ffmpeg.
We really need this option working, the only solution that works now is to either keep using old ffmpeg 3.3.8 (which has other issues) or do the deinterlacing with yadif, but then you have unnecessary transfers between main memory and gpu memory.

I was looking at this earlier as I was also trying to get cuvid deint working, and I don’t get the same problem.

I started with a directshow capture file, capturing a 1080i50 signal in yuyv422 format using the ffvhuff codec.

I then created a lossless h264 file from it (I could have done this directly as part of the initial capture, but didn’t):

ffmpeg -i capture_ffvhuff.asf -vcodec libx264 -pix_fmt nv12 -preset ultrafast -crf 0 -g 1 -flags +ildct+ilme -top 1 -acodec copy -f asf capture_h264.asf -y

Then the yadif version of the deinterlace is as follows:
ffmpeg -i capture_h264.asf -vcodec hevc_nvenc -preset lossless -vf yadif=1 -acodec aac -ab 192k -r 50 -f mp4 encode_yadif_deint.mp4 -y

Finally, the cuvid deint:
ffmpeg -vcodec h264_cuvid -deint adaptive -i capture_h264.asf -vcodec hevc_nvenc -preset lossless -acodec aac -ab 192k -r 50 -f mp4 encode_cuvid_deint.mp4 -y

This is with FFmpeg built using jb’s suite from the 4.1 branch. This doesn’t drop any frames, however it took a while to figure out the correct flags to use to create the h264 file: “-flags +ildct+ilme -top 1”.

Without these flags I found that cuvid_deint presumably did not recognise it as interlaced, and so I think either the deint did nothing, or just dropped half the frames.

By the way, there is another alternative in that there is now a yadif_cuda filter if you build your FFmpeg with cuda-sdk.

You can do this:

ffmpeg -init_hw_device cuda=0 -i capture_ffvhuff.asf -filter_hw_device 0 -vcodec hevc_nvenc -pix_fmt cuda -preset lossless -vf format=nv12,hwupload,yadif_cuda=1 -acodec aac -ab 192k -r 50 -f mp4 encode_yadif_cuda_deint.mp4

Obviously in the above example I’m uploading frames from system memory to the GPU, but if you were already decoding in the GPU then you could use yadif_cuda directly.
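
For the GPU-decode case, something along these lines might work. It is an untested sketch that only combines options already shown in this thread (-hwaccel cuvid, -c:v h264_cuvid, yadif_cuda=1, h264_nvenc); the helper name yadif_cuda_cmd and the file names are made up. Note that -deint is left out, so the decoder does not deinterlace and yadif_cuda does the work on frames that are already in GPU memory.

```shell
# Dry-run sketch: prints the command so it can be reviewed before running.
# yadif_cuda_cmd is a hypothetical helper name, not part of ffmpeg.
yadif_cuda_cmd() {
    in=$1
    out=$2
    # Assumption: the cuvid decoder hands CUDA frames straight to yadif_cuda,
    # so no hwupload step is needed. yadif_cuda=1 outputs one frame per field.
    echo ffmpeg -hwaccel cuvid -c:v h264_cuvid -i "$in" \
        -vf yadif_cuda=1 \
        -c:v h264_nvenc -preset slow -c:a copy -f mpegts "$out"
}

yadif_cuda_cmd input.ts output.ts
```

Drop the echo (or paste the printed line into a terminal) once the options look right for your build.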

@oviano: Thank you, will check it asap, although the original issue still stands.

I wonder - are you able to remux your input.ts file making sure that the flags -flags +ildct+ilme -top 1 are present?

It’s just that I had the same issue as you with cuvid deint until I made sure my file was correctly flagged. Yadif worked regardless, but maybe cuvid deint is fussier?

Just a thought anyway. Probably your issue is different!

yadif_cuda works great. Thank you for letting me know about it.

Remuxing input.ts is not an option for my workload: I transcode live streams (http or udp multicast) in real time. In any case, the input is correctly flagged as interlaced, and cuvid’s -deint adaptive option does actually deinterlace. The problem is that, with versions after ffmpeg 3.3.8, -r 50 on the output drops frames on most inputs (I have found some inputs that work). With -r 50 on the input no frames are dropped, but if there is a problem in the input stream (a corrupted packet or similar, which happens frequently in live streams) it leads to video/audio sync issues. With ffmpeg 3.3.8, which is quite old now, -r 50 on the output does not drop frames and works fine.

Ah right, so it depends on the input, hopefully they will fix it then.

I’m glad the cuda version of yadif is working for you.

Your reply made me try to find out which inputs have the problem with -r 50 as output on newer ffmpeg versions when using cuvid’s -deint adaptive, and which do not.
It seems the problem happens when the input is encoded as MBAFF. If the input is encoded as PAFF, then -r 50 as output works fine!

The following file has the problem with -r 50 as output; it drops frames.

mediainfo input1.ts 
General
ID                                       : 30204 (0x75FC)
Complete name                            : test3.ts
Format                                   : MPEG-TS
File size                                : 22.9 MiB
Duration                                 : 36 s 112 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 5 285 kb/s

Video
ID                                       : 251 (0xFB)
Menu ID                                  : 1451 (0x5AB)
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L4
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Codec ID                                 : 27
Duration                                 : 35 s 680 ms
Bit rate                                 : 4 830 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate                               : 25.000 FPS
Standard                                 : Component
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : MBAFF
Scan type, store method                  : Interleaved fields
Scan order                               : Top Field First
Bits/(Pixel*Frame)                       : 0.093
Stream size                              : 20.5 MiB (90%)
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709

while this one does not

mediainfo test2.ts 
General
ID                                       : 5 (0x5)
Complete name                            : test2.ts
Format                                   : MPEG-TS
File size                                : 24.4 MiB
Duration                                 : 46 s 632 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 4 374 kb/s

Video
ID                                       : 408 (0x198)
Menu ID                                  : 1008 (0x3F0)
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format version                           : Version 2
Format profile                           : Main@L4
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Codec ID                                 : 27
Duration                                 : 46 s 440 ms
Bit rate                                 : 3 770 kb/s
Maximum bit rate                         : 5 044 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate                               : 25.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Interlaced
Scan type, store method                  : Separated fields
Scan order                               : Top Field First
Bits/(Pixel*Frame)                       : 0.073
Stream size                              : 20.9 MiB (86%)
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
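
To sort other inputs the same way, here is a rough sketch that reads mediainfo text output and looks only at the two "Scan type" fields shown above. The helper name scan_kind is made up, and the mapping is an assumption based on these two files only: "Scan type: MBAFF" is treated as MBAFF, and "Separated fields" as PAFF.

```shell
# Sketch: classify a stream from mediainfo text output read on stdin.
# scan_kind is a hypothetical helper name.
scan_kind() {
    # Split "Key      : Value" lines on the colon and surrounding spaces.
    awk -F' *: *' '
        $1 == "Scan type"               { type = $2 }
        $1 == "Scan type, store method" { store = $2 }
        END {
            if (type == "MBAFF")                  print "MBAFF"
            else if (store == "Separated fields") print "PAFF"
            else                                  print "other"
        }'
}

# Usage (needs the mediainfo CLI):
#   mediainfo input.ts | scan_kind
```

On the two listings above this should report MBAFF for the file that drops frames and PAFF for the one that does not.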

Could you try adding -r 50 before input?

ffmpeg -r 50 -i input_25i.ts …

As I already have mentioned, -r 50 on input does not drop frames, but leads to video/audio sync errors if input has any error - and errors can happen when transcoding live streams which is my workload.
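
To show why this matters, here is a deliberately simplified model (not ffmpeg code). With -r on the input, video timestamps are regenerated at a constant rate, so every frame lost to an input error moves the rest of the video earlier by one frame duration while the audio keeps its own timestamps. The helper name av_drift_ms is made up.

```shell
# Simplified model, not ffmpeg code: drift between audio and video when
# video timestamps are regenerated at a constant rate and frames are lost.
# $1: video frame rate after deinterlacing, $2: number of frames lost so far
av_drift_ms() {
    awk -v fps="$1" -v lost="$2" 'BEGIN { printf "%d\n", lost * 1000 / fps }'
}

av_drift_ms 50 5    # 5 lost frames at 50 fps, prints 100 (milliseconds)
```

The drift only grows over the life of a long-running live transcode, since nothing resynchronizes the regenerated timestamps.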

So maybe there is just a simple bug in the ffmpeg cuvid decoder, which should report double the frame rate for a deinterlaced source.