NVDEC in FFMPEG (cuvid) drops frames when using deinterlacing (deint 2)

malakudi · October 4, 2018, 8:33am

Recent ffmpeg versions fail to use the nvdec (cuvid) deinterlacer correctly. The last version known to work is 3.3.8; newer versions drop frames.
Example command (input.ts is a 1080i50 file):
ffmpeg -hwaccel cuvid -c:v h264_cuvid -deint adaptive -resize 1280x720 -f mpegts -i input.ts -vcodec h264_nvenc -preset slow -c:a copy -r 50 -f mpegts output.ts

With 3.3.8 everything works fine; with 3.4 and above, frames are dropped.

Example output:

frame= 1986 fps=639 q=33.0 Lsize= 13142kB time=00:02:30.52 bitrate= 715.2kbits/s dup=0 drop=1798 speed=48.4x

The output is unplayable.
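
For anyone who wants to check for this from a script, here is a minimal sketch that pulls the drop= counter out of an ffmpeg status line like the one above. The function name drop_count is made up for illustration; the sample line is copied from this post.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "drop=" counter found in an
# ffmpeg status line. Prints nothing if the line has no drop= field.
drop_count() {
    printf '%s\n' "$1" | sed -n 's/.*drop=\([0-9][0-9]*\).*/\1/p'
}

# Status line copied from the report above.
line='frame= 1986 fps=639 q=33.0 Lsize= 13142kB time=00:02:30.52 bitrate= 715.2kbits/s dup=0 drop=1798 speed=48.4x'
drop_count "$line"
```

A non-zero value after a deinterlacing run is the symptom described here.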

With ffmpeg 3.3.8 WITHOUT the -r 50 option on the output, we also get drops. With -r 50 on the output, the resulting file works.

Should I report this to ffmpeg? Or is it a cuvid issue that makes it incompatible with recent ffmpeg versions?

malakudi · October 4, 2018, 1:25pm

A ticket about this issue is already open in the ffmpeg trac system.
https://trac.ffmpeg.org/ticket/6971

I have traced the issue to the following ffmpeg commit:

https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/bddb2343b6e594e312dadb5d21b408702929ae04

commit ‘061a0c14bb5767bca72e3a7227ca400de439ba09’:
decode: restructure the core decoding code

CUVID decoder adapted by wm4.

Maybe you should help the ffmpeg devs to fix this.

rypark · October 5, 2018, 1:43am

Hi malakudi,

Could you give us some additional information:

GPU
Video Codec SDK version being used
Driver version

Thanks,
Ryan Park

malakudi · October 5, 2018, 7:15am

GPU: Pascal. Tried Quadro P2000 and GTX 1070 Ti; both show the same result.
Video Codec SDK: tried 8.0 and 8.1 (older ffmpeg versions with 8.0, newer with 8.1).
Driver: 384.xx and 390.xx, on Linux.

The commit that triggers the issue is a redesign of ffmpeg’s core decoding code. The cuvid code was “adapted” to the new core decoding code by a developer named “wm4”. I don’t know whether he/she is affiliated with NVIDIA or is an ffmpeg developer, but the issue is in this redesign. It affects versions 3.4 and above, since it was implemented in the master branch after the 3.3 release was cut from master.

As it is, I can’t use the cuvid deinterlacer with ffmpeg 3.4 and above, and I am forced to use 3.3.x versions.

Thunderm · October 5, 2018, 4:49pm

You should set the -drop_second_field parameter to 1 ;)

malakudi · October 5, 2018, 5:18pm

Of course not. And if you suggest that, you don’t care about the quality of your encodes. This is not real deinterlacing; you lose half the motion detail. It shows clearly in fast-moving scenes. You can’t keep the fluidity of movement that 50i shows when converting to 25p. Only conversion to 50p keeps all the fluidity.

I want real deinterlacing with -drop_second_field at its default of 0. This creates 50p output with all the detail.

The issue, of course, only shows up when converting to 50p (or 60p for 1080i60) with -drop_second_field set to 0, which is also the default.

rypark · October 5, 2018, 7:09pm

Hi Malakudi,

Which OS are you using specifically?

Thanks,
Ryan Park

malakudi · October 5, 2018, 7:13pm

Linux, Debian 9

I can test in Windows 10 if you wish.

malakudi · October 8, 2018, 8:57am

Tried in Windows 10, same behaviour.

malakudi · October 21, 2018, 10:10am

Is any dev looking at this? Other h/w accelerated decoders use a filter for deinterlacing (deinterlace_qsv, deinterlace_vaapi), so they are not affected by the “decode: restructure the core decoding code” change in ffmpeg.
We really need this option working. The only solutions that work now are to either keep using the old ffmpeg 3.3.8 (which has other issues) or do the deinterlacing with yadif, which adds unnecessary transfers between main memory and GPU memory.

oviano · November 13, 2018, 11:34pm

I was looking at this earlier as I was also trying to get cuvid deint working, and I don’t get the same problem.

I started with a directshow capture file, capturing a 1080i50 signal in yuyv422 format using the ffvhuff codec.

I then created a lossless h264 file from it (I could have done this directly as part of the initial capture, but didn’t):

ffmpeg -i capture_ffvhuff.asf -vcodec libx264 -pix_fmt nv12 -preset ultrafast -crf 0 -g 1 -flags +ildct+ilme -top 1 -acodec copy -f asf capture_h264.asf -y

Then the yadif version of the deinterlace is as follows:
ffmpeg -i capture_h264.asf -vcodec hevc_nvenc -preset lossless -vf yadif=1 -acodec aac -ab 192k -r 50 -f mp4 encode_yadif_deint.mp4 -y

Finally, the cuvid deint:
ffmpeg -vcodec h264_cuvid -deint adaptive -i capture_h264.asf -vcodec hevc_nvenc -preset lossless -acodec aac -ab 192k -r 50 -f mp4 encode_cuvid_deint.mp4 -y

This is with FFmpeg built using jb’s suite from the 4.1 branch. This doesn’t drop any frames. However, it took a while to figure out the correct flags to use when creating the h264 file: “-flags +ildct+ilme -top 1”.

Without these flags, cuvid deint presumably did not recognise the file as interlaced, so I think the deint either did nothing or just dropped half the frames.

oviano · November 13, 2018, 11:44pm

By the way, there is another alternative in that there is now a yadif_cuda filter if you build your FFmpeg with cuda-sdk.

You can do this:

ffmpeg -init_hw_device cuda=0 -i capture_ffvhuff.asf -filter_hw_device 0 -vcodec hevc_nvenc -pix_fmt cuda -preset lossless -vf format=nv12,hwupload,yadif_cuda=1 -acodec aac -ab 192k -r 50 -f mp4 encode_yadif_cuda_deint.mp4

Obviously in the above example I’m uploading frames from system memory to the GPU, but if you were already decoding in the GPU then you could use yadif_cuda directly.
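
A rough, untested sketch of that GPU-only variant follows. It assumes an FFmpeg build that includes yadif_cuda and that -hwaccel cuvid with h264_cuvid leaves the decoded frames in CUDA memory, so no hwupload step is needed. The function name yadif_cuda_cmd is invented for illustration, and the encoder options are borrowed from the first post. The function only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an ffmpeg command that decodes with
# h264_cuvid, deinterlaces on the GPU with yadif_cuda, and encodes with nvenc.
# yadif_cuda=1 is the "one output frame per field" mode (1080i50 -> 50p).
# Assumption: frames stay in CUDA memory between decoder, filter and encoder.
yadif_cuda_cmd() {
    printf 'ffmpeg -hwaccel cuvid -c:v h264_cuvid -i %s -vf yadif_cuda=1 -c:v h264_nvenc -preset slow -c:a copy -f mpegts %s\n' "$1" "$2"
}

yadif_cuda_cmd input.ts output.ts
```

Note that the cuvid -deint option is left out on purpose, so the deinterlacing happens only in the filter.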

malakudi · November 14, 2018, 8:32am

@oviano: Thank you, will check it asap, although the original issue still stands.

oviano · November 14, 2018, 8:56am

I wonder - are you able to remux your input.ts file making sure that the flags -flags +ildct+ilme -top 1 are present?

It’s just that I had the same issue as you with cuvid deint until I made sure my file was correctly flagged. Yadif worked regardless, but maybe cuvid deint is fussier?

Just a thought anyway. Probably your issue is different!

malakudi · November 14, 2018, 9:12am

yadif_cuda works great. Thank you for letting me know about it.

Remuxing input.ts is not an option for my workload: I transcode live streams (http or udp multicast) in real time. However, the input is correctly flagged as interlaced, and cuvid’s -deint adaptive option actually does deinterlace. The problem is that, after the changes that followed ffmpeg 3.3.8, -r 50 on the output doesn’t work on most inputs and drops frames (I have found some inputs that work). -r 50 on the input does not drop frames, but if there is a problem in the input stream (a corrupted packet or similar, which happens frequently in live streams), it leads to video/audio sync issues. With ffmpeg 3.3.8, which is quite old now, -r 50 on the output does not drop frames and works fine.

oviano · November 14, 2018, 9:17am

Ah right, so it depends on the input, hopefully they will fix it then.

I’m glad the cuda version of yadif is working for you.

malakudi · November 14, 2018, 9:25am

Your reply made me try to find out which inputs have a problem with -r 50 on the output on newer ffmpeg versions when using cuvid’s -deint adaptive, and which do not.
It seems the problem happens when the input is encoded as MBAFF. If the input is encoded as PAFF, then -r 50 on the output works fine!

The following file has the problem with -r 50 on the output; it drops frames.

mediainfo input1.ts 
General
ID                                       : 30204 (0x75FC)
Complete name                            : test3.ts
Format                                   : MPEG-TS
File size                                : 22.9 MiB
Duration                                 : 36 s 112 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 5 285 kb/s

Video
ID                                       : 251 (0xFB)
Menu ID                                  : 1451 (0x5AB)
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L4
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Codec ID                                 : 27
Duration                                 : 35 s 680 ms
Bit rate                                 : 4 830 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate                               : 25.000 FPS
Standard                                 : Component
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : MBAFF
Scan type, store method                  : Interleaved fields
Scan order                               : Top Field First
Bits/(Pixel*Frame)                       : 0.093
Stream size                              : 20.5 MiB (90%)
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709

while this one does not

mediainfo test2.ts 
General
ID                                       : 5 (0x5)
Complete name                            : test2.ts
Format                                   : MPEG-TS
File size                                : 24.4 MiB
Duration                                 : 46 s 632 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 4 374 kb/s

Video
ID                                       : 408 (0x198)
Menu ID                                  : 1008 (0x3F0)
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format version                           : Version 2
Format profile                           : Main@L4
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Codec ID                                 : 27
Duration                                 : 46 s 440 ms
Bit rate                                 : 3 770 kb/s
Maximum bit rate                         : 5 044 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate                               : 25.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Interlaced
Scan type, store method                  : Separated fields
Scan order                               : Top Field First
Bits/(Pixel*Frame)                       : 0.073
Stream size                              : 20.9 MiB (86%)
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
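
To sort a batch of inputs the same way, a small sketch like the one below could classify mediainfo's text output. It assumes, based only on the two dumps above, that MBAFF streams are reported as "Scan type : MBAFF" and PAFF streams as "Scan type : Interlaced" with a "Separated fields" store method. The function name scan_kind is made up.

```shell
#!/bin/sh
# Hypothetical helper: read mediainfo text output on stdin and print
# "MBAFF", "PAFF", "unknown" (no scan type found) or the raw scan type.
# Assumption: PAFF shows up as "Interlaced" + "Separated fields".
scan_kind() {
    awk -F': ' '
        { v = $2; sub(/[ \t\r]+$/, "", v) }
        /^Scan type, store method/ { store = v; next }
        /^Scan type/               { type = v }
        END {
            if (type == "MBAFF") { print "MBAFF" }
            else if (type == "Interlaced" && store == "Separated fields") { print "PAFF" }
            else if (type == "") { print "unknown" }
            else { print type }
        }'
}

# Real use would be:  mediainfo input.ts | scan_kind
# Demo with two lines taken from the first dump above:
printf '%s\n' \
    'Scan type                                : MBAFF' \
    'Scan type, store method                  : Interleaved fields' | scan_kind
```

If a file has several video streams, this sketch reports only the last one.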

Thunderm · November 14, 2018, 9:28am

Could you try adding -r 50 before input?

ffmpeg -r 50 -i input_25i.ts …

malakudi · November 14, 2018, 9:31am

As I have already mentioned, -r 50 on the input does not drop frames, but it leads to video/audio sync errors if the input has any errors, and errors can happen when transcoding live streams, which is my workload.
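
To make the two placements explicit, here is a small sketch that prints both variants of the command from the first post (the -resize option is left out for brevity). The only difference is whether -r comes before or after -i. As an input option, -r makes ffmpeg ignore the stream's timestamps and assume a constant frame rate, which would be consistent with the sync drift on damaged live input. As an output option, it makes ffmpeg duplicate or drop frames to reach the requested rate. The function name rate_cmd is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the command with -r placed either
# as an input option (before -i) or as an output option (after -i).
# $1 = "input" or "output", $2 = frame rate
rate_cmd() {
    dec='-hwaccel cuvid -c:v h264_cuvid -deint adaptive'
    enc='-c:v h264_nvenc -preset slow -c:a copy -f mpegts output.ts'
    case $1 in
        input)  printf 'ffmpeg %s -r %s -i input.ts %s\n' "$dec" "$2" "$enc" ;;
        output) printf 'ffmpeg %s -i input.ts -r %s %s\n' "$dec" "$2" "$enc" ;;
        *)      echo 'usage: rate_cmd input|output RATE' >&2; return 1 ;;
    esac
}

rate_cmd input 50
rate_cmd output 50
```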

Thunderm · November 14, 2018, 12:38pm

So maybe there is just a simple bug in the ffmpeg cuvid decoder, which should report double the frame rate for a deinterlaced source.
