FFmpeg cuvid decoder frame pts out of order

Hi.
We are seeing problems with some videos when decoding with the h264_cuvid decoder: sometimes the frame pts goes backwards.
When I run ffmpeg with the showinfo filter, the null muxer reports "Application provided invalid, non monotonically increasing dts to muxer in stream".

Sorry for the long printout below

Running the command:
ffmpeg -hide_banner -c:v h264_cuvid -i "D:\Movies\horses_Rodeo_m1.mp4" -vf showinfo -f null -
outputs:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'D:\Movies\horses_Rodeo_m1.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.19.104
Duration: 00:11:21.15, start: 0.000000, bitrate: 2408 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 2310 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 93 kb/s (default)
Metadata:
handler_name : SoundHandler
Stream mapping:
Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> wrapped_avframe (native))
Stream #0:1 -> #0:1 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
[Parsed_showinfo_0 @ 000001E778E89000] config in time_base: 1/10240, frame_rate: 20/1
[Parsed_showinfo_0 @ 000001E778E89000] config out time_base: 0/0, frame_rate: 0/0
[Parsed_showinfo_0 @ 000001E778E89000] n: 0 pts: 0 pts_time:0 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:9A20B5DA plane_checksum:[207042FB 794572DF] mean:[77 128] stdev:[32.1 6.9]
Output #0, null, to 'pipe:':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.71.100
Stream #0:0(und): Video: wrapped_avframe, nv12, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 20 fps, 20 tbn, 20 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc57.89.100 wrapped_avframe
Stream #0:1(und): Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s (default)
Metadata:
handler_name : SoundHandler
encoder : Lavc57.89.100 pcm_s16le
[Parsed_showinfo_0 @ 000001E778E89000] n: 1 pts: 512 pts_time:0.05 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:1060D76F plane_checksum:[41CB6D1D D9806A52] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 2 pts: 1024 pts_time:0.1 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:5583DBB3 plane_checksum:[2DB5702C 0ECE6B87] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 3 pts: 1536 pts_time:0.15 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:87ECB745 plane_checksum:[F168479B 0C496FAA] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 4 pts: 2048 pts_time:0.2 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:A146A2D5 plane_checksum:[3B3629AA 50BA792B] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 5 pts: 2560 pts_time:0.25 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:2233A516 plane_checksum:[F81929DE 3C7C7B38] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 6 pts: 3584 pts_time:0.35 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:6602D1F7 plane_checksum:[DC664F3B AD7A82BC] mean:[77 128] stdev:[32.2 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 7 pts: 4096 pts_time:0.4 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:9B13F32A plane_checksum:[01A46A48 FE9988E2] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 8 pts: 3072 pts_time:0.3 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:0D70EDEE plane_checksum:[AA175DA7 900D9047] mean:[77 128] stdev:[32.2 6.9]
[null @ 000001E77A866460] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 8 >= 6
[Parsed_showinfo_0 @ 000001E778E89000] n: 9 pts: 4608 pts_time:0.45 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:B2CDEAFC plane_checksum:[E18A64C9 7CF58633] mean:[77 128] stdev:[32.2 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 10 pts: 5120 pts_time:0.5 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:950CD5B4 plane_checksum:[2F144A6D FA4E8B47] mean:[77 128] stdev:[32.2 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 11 pts: 5632 pts_time:0.55 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:748DC381 plane_checksum:[F12B390B 99A88A76] mean:[77 128] stdev:[32.2 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 12 pts: 6144 pts_time:0.6 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:BA21B673 plane_checksum:[C8742761 DCAF8F12] mean:[77 128] stdev:[32.2 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 13 pts: 6656 pts_time:0.65 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:62D591E1 plane_checksum:[A1C1032C B8268EB5] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 14 pts: 7168 pts_time:0.7 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:113B60A0 plane_checksum:[1780CDD6 94AE92BB] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 15 pts: 7680 pts_time:0.75 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:D3AA3926 plane_checksum:[2D24A654 870D92C3] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 16 pts: 8192 pts_time:0.8 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:0B4C27A1 plane_checksum:[943D929C 8FF094F6] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 17 pts: 9216 pts_time:0.9 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:48D73DD3 plane_checksum:[24D2A3CF 748E99F5] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 18 pts: 9728 pts_time:0.95 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:EBE4309E plane_checksum:[EE909466 B9949C29] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 19 pts: 8704 pts_time:0.85 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:85C82943 plane_checksum:[29BB8930 C956A004] mean:[77 128] stdev:[32.1 6.9]
[null @ 000001E77A866460] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 19 >= 17
[Parsed_showinfo_0 @ 000001E778E89000] n: 20 pts: 10752 pts_time:1.05 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:874535CC plane_checksum:[11339655 596A9F68] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 21 pts: 11264 pts_time:1.1 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:2929306F plane_checksum:[C8B18C91 19B9A3CF] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 22 pts: 10240 pts_time:1 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:1F26257C plane_checksum:[461F8419 DC66A154] mean:[77 128] stdev:[32.1 6.9]
[null @ 000001E77A866460] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 22 >= 20
[Parsed_showinfo_0 @ 000001E778E89000] n: 23 pts: 11776 pts_time:1.15 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:68C826F3 plane_checksum:[3F75765E 4DCCB086] mean:[77 128] stdev:[32.1 6.9]
[Parsed_showinfo_0 @ 000001E778E89000] n: 24 pts: 12288 pts_time:1.2 pos: -1 fmt:nv12 sar:1/1 s:1920x1080 i:P iskey:1 type:? checksum:FC66E4E0 plane_checksum:[53F93E47 AC56A699] mean:[77 128] stdev:[32.1 6.9]
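
For anyone who wants to check a longer log without reading it by eye, here is a minimal Python sketch (not part of ffmpeg) that scans showinfo output for frames whose pts is lower than the previous frame's. It also rescales the offending values from the input time base (1/10240) to the output time base (1/20, taken from "20 tbn" on the output stream line) to show where the numbers in the muxer warning come from. The function names and the trimmed sample lines are my own, and I am assuming the warning prints the previous and current dts in the output time base, which is consistent with the log above.

```python
import re
from fractions import Fraction

# Matches the "n: <index> pts: <value>" part of a showinfo frame line.
SHOWINFO_RE = re.compile(r"\bn:\s*(\d+)\s+pts:\s*(-?\d+)")


def find_pts_regressions(log_text):
    """Return (n, previous_pts, pts) for each frame whose pts went backwards."""
    regressions = []
    prev = None
    for line in log_text.splitlines():
        if "Parsed_showinfo" not in line:
            continue
        m = SHOWINFO_RE.search(line)
        if not m:
            continue  # e.g. the "config in time_base" lines
        n, pts = int(m.group(1)), int(m.group(2))
        if prev is not None and pts < prev:
            regressions.append((n, prev, pts))
        prev = pts
    return regressions


def rescale(pts, src_tb, dst_tb):
    """Convert pts from time base src_tb to dst_tb, rounded to nearest."""
    return round(pts * src_tb / dst_tb)


# Trimmed excerpt of the h264_cuvid log above (frames 5 to 9).
SAMPLE = """\
[Parsed_showinfo_0 @ 000001E778E89000] n: 5 pts: 2560 pts_time:0.25
[Parsed_showinfo_0 @ 000001E778E89000] n: 6 pts: 3584 pts_time:0.35
[Parsed_showinfo_0 @ 000001E778E89000] n: 7 pts: 4096 pts_time:0.4
[Parsed_showinfo_0 @ 000001E778E89000] n: 8 pts: 3072 pts_time:0.3
[Parsed_showinfo_0 @ 000001E778E89000] n: 9 pts: 4608 pts_time:0.45
"""

if __name__ == "__main__":
    in_tb, out_tb = Fraction(1, 10240), Fraction(1, 20)
    for n, prev, cur in find_pts_regressions(SAMPLE):
        # For frame 8 this prints: n=8: 8 >= 6
        print(f"n={n}: {rescale(prev, in_tb, out_tb)} >= {rescale(cur, in_tb, out_tb)}")
```

Applied to the three regressions in the log above (frames 8, 19 and 22), the rescaled pairs are 8 >= 6, 19 >= 17 and 22 >= 20, the same pairs the null muxer prints.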

When running with the native h264 decoder I get the following output:
ffmpeg -hide_banner -i "D:\Movies\horses_Rodeo_m1.mp4" -vf showinfo -f null -
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'D:\Movies\horses_Rodeo_m1.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.19.104
Duration: 00:11:21.15, start: 0.000000, bitrate: 2408 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 2310 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 93 kb/s (default)
Metadata:
handler_name : SoundHandler
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> wrapped_avframe (native))
Stream #0:1 -> #0:1 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] config in time_base: 1/10240, frame_rate: 20/1
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] config out time_base: 0/0, frame_rate: 0/0
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 0 pts: 0 pts_time:0 pos: 48 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:1 type:I checksum:30CAB5DA plane_checksum:[207042FB 013C17D8 68EA5B07] mean:[77 123 134] stdev:[32.1 4.4 3.9]
Output #0, null, to 'pipe:':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.71.100
Stream #0:0(und): Video: wrapped_avframe, yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 20 fps, 20 tbn, 20 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc57.89.100 wrapped_avframe
Stream #0:1(und): Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s (default)
Metadata:
handler_name : SoundHandler
encoder : Lavc57.89.100 pcm_s16le
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 1 pts: 512 pts_time:0.05 pos: 307726 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:267DD76F plane_checksum:[41CB6D1D 60CA0D2A 3A8A5D28] mean:[77 123 134] stdev:[32.1 4.4 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 2 pts: 1024 pts_time:0.1 pos: 313036 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:CE72DBB3 plane_checksum:[2DB5702C 8AC3110C 29DA5A7B] mean:[77 123 134] stdev:[32.1 4.4 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 3 pts: 1536 pts_time:0.15 pos: 324791 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:4F06B745 plane_checksum:[F168479B 81FD1230 32DD5D7A] mean:[77 123 134] stdev:[32.1 4.4 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 4 pts: 2048 pts_time:0.2 pos: 334713 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:D974A2D5 plane_checksum:[3B3629AA E4BC19D1 733F5F5A] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 5 pts: 2560 pts_time:0.25 pos: 344230 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:0C98A516 plane_checksum:[F81929DE 68001BB2 E5F25F86] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 6 pts: 3584 pts_time:0.35 pos: 373890 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:1725D1F7 plane_checksum:[DC664F3B 8E95226B 78496051] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 7 pts: 4096 pts_time:0.4 pos: 376781 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:2714F32A plane_checksum:[01A46A48 32C92437 7ED964AB] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 8 pts: 4608 pts_time:0.45 pos: 354165 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:8C6CEDEE plane_checksum:[AA175DA7 B445263F 48BE6A08] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 9 pts: 5120 pts_time:0.5 pos: 379668 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:2754EAFC plane_checksum:[E18A64C9 A44F1F2C CDA06707] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 10 pts: 5632 pts_time:0.55 pos: 388750 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:EFBCD5B4 plane_checksum:[2F144A6D 04A82146 2D876A01] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 11 pts: 6144 pts_time:0.6 pos: 397591 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:ADDAC381 plane_checksum:[F12B390B C22623C0 BDFA66B6] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 12 pts: 6656 pts_time:0.65 pos: 408018 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:0608B673 plane_checksum:[C8742761 498B2385 5A936B8D] mean:[77 123 134] stdev:[32.2 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 13 pts: 7168 pts_time:0.7 pos: 417408 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:C29991E1 plane_checksum:[A1C1032C 8A062020 89506E95] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 14 pts: 7680 pts_time:0.75 pos: 426872 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:603F60A0 plane_checksum:[1780CDD6 C43E1E72 40367449] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 15 pts: 8192 pts_time:0.8 pos: 435789 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:887B3926 plane_checksum:[2D24A654 3DC91BDB C12A76E8] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 16 pts: 8704 pts_time:0.85 pos: 444255 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:C0E727A1 plane_checksum:[943D929C 5D871D5D A6367799] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 17 pts: 9216 pts_time:0.9 pos: 479775 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:57613DD3 plane_checksum:[24D2A3CF B0051DC0 48557C35] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 18 pts: 9728 pts_time:0.95 pos: 482967 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:0B2F309E plane_checksum:[EE909466 F27D1F51 A8AA7CD8] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 19 pts: 10240 pts_time:1 pos: 460439 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:1ADB2943 plane_checksum:[29BB8930 2D3C1EFC 77F38108] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 20 pts: 10752 pts_time:1.05 pos: 504829 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:A30535CC plane_checksum:[11339655 456B2075 A6BC7EF3] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 21 pts: 11264 pts_time:1.1 pos: 508056 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:A18A306F plane_checksum:[C8B18C91 908C22B0 BCD1811F] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 22 pts: 11776 pts_time:1.15 pos: 486481 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:20CB257C plane_checksum:[461F8419 7E181C64 329384F0] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 23 pts: 12288 pts_time:1.2 pos: 511303 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:0D4126F3 plane_checksum:[3F75765E 27992424 457E8C62] mean:[77 123 134] stdev:[32.1 4.5 3.9]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 24 pts: 12800 pts_time:1.25 pos: 527932 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:8DD0E4E0 plane_checksum:[53F93E47 00E9192D 9BF88D6C] mean:[77 123 134] stdev:[32.1 4.5 3.9]
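
The two logs also make it possible to match pictures between the runs. The first plane is luma for both nv12 and yuv420p, so the first plane_checksum value can serve as a key, as long as both decoders produce bit-identical luma (they appear to for the frames shown here). The sketch below, again hypothetical helper code rather than anything from ffmpeg, pairs frames that way and lists those whose pts differs. Repeated identical frames would collide on the checksum, so only the first occurrence is kept.

```python
import re

# Captures the pts and the first (luma) plane checksum of a showinfo frame line.
FRAME_RE = re.compile(
    r"\bn:\s*\d+\s+pts:\s*(-?\d+).*?plane_checksum:\[([0-9A-F]+)"
)


def pts_by_luma_checksum(log_text):
    """Map luma plane checksum -> pts, keeping the first occurrence only."""
    result = {}
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            result.setdefault(m.group(2), int(m.group(1)))
    return result


def compare_pts(log_a, log_b):
    """Return (checksum, pts_a, pts_b) for pictures found in both logs with different pts."""
    a = pts_by_luma_checksum(log_a)
    b = pts_by_luma_checksum(log_b)
    return [(ck, a[ck], b[ck]) for ck in a if ck in b and a[ck] != b[ck]]


# Trimmed excerpts of the two logs above (frames 7 to 9 of each run).
CUVID = """\
[Parsed_showinfo_0 @ 000001E778E89000] n: 7 pts: 4096 pts_time:0.4 checksum:9B13F32A plane_checksum:[01A46A48 FE9988E2]
[Parsed_showinfo_0 @ 000001E778E89000] n: 8 pts: 3072 pts_time:0.3 checksum:0D70EDEE plane_checksum:[AA175DA7 900D9047]
[Parsed_showinfo_0 @ 000001E778E89000] n: 9 pts: 4608 pts_time:0.45 checksum:B2CDEAFC plane_checksum:[E18A64C9 7CF58633]
"""
NATIVE = """\
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 7 pts: 4096 pts_time:0.4 checksum:2714F32A plane_checksum:[01A46A48 32C92437 7ED964AB]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 8 pts: 4608 pts_time:0.45 checksum:8C6CEDEE plane_checksum:[AA175DA7 B445263F 48BE6A08]
[Parsed_showinfo_0 @ 0000020E6F3A4CA0] n: 9 pts: 5120 pts_time:0.5 checksum:2754EAFC plane_checksum:[E18A64C9 A44F1F2C CDA06707]
"""

if __name__ == "__main__":
    for checksum, cuvid_pts, native_pts in compare_pts(CUVID, NATIVE):
        print(f"{checksum}: cuvid pts {cuvid_pts}, native pts {native_pts}")
```

With the lines above, the pictures at n: 8 and n: 9 come out with different pts in the two runs (3072 vs 4608 and 4608 vs 5120), while n: 7 agrees. That suggests the pictures are delivered in the same order in both runs and only the timestamps attached to them differ.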

I am using Windows 10, a GTX 1070, CUDA 8.0.44 and FFmpeg 3.3.

Provide a link to a sample source stream if you want us to look at it. You could have discontinuities in the stream. It is not that unusual.

PM sent

I strongly doubt that this has anything to do with CUVID. First, PTS/DTS are container data, and CUVID accepts only elementary streams. Second, you can pass a "timestamp" field when you inject a packet of data, and it can be retrieved in HandlePictureDisplay() as pPicParams->timestamp, but this is a transparent passthrough of the timestamp you passed in. Finally, a Google search shows many applications running into this error both with and without CUVID. I suggest you file a bug with FFmpeg.

Which version of ffmpeg?

The cuvid decoder was rewritten between 3.3 and 3.4, and some bug has been present since 3.4.