Question: changes in buf.pts when nvv4l2decoder skip-frames=decode_key

• Hardware Platform: GPU
• DeepStream Version: 6.3
• TensorRT Version: 8.5.3
• NVIDIA GPU Driver Version: 550.54.15
• Issue Type: Questions/Bugs

Hello, I have noticed an interesting behaviour of nvv4l2decoder when using different values for the skip-frames property. It concerns the pts values of the GStreamer buffers produced by the decoder element.

We read a sample video with a GStreamer pipeline and are interested only in the key-frames. The video should have a key-frame every 30 frames (that is ~1.2 seconds), but when reading the video with skip-frames=decode_key, the pts values of the buffers are spaced very differently.

Let me illustrate this with code:

from datetime import datetime, timezone

import gi

gi.require_version('Gst', '1.0')
from gi.repository import Gst


def pts_print_probe(pad: Gst.Pad, info: Gst.PadProbeInfo, udata) -> Gst.PadProbeReturn:
    buf: Gst.Buffer = info.get_buffer()
    dt_buf = datetime.fromtimestamp(buf.pts / Gst.SECOND, tz=timezone.utc)

    print(f'pts: {buf.pts}, {dt_buf}')

    return Gst.PadProbeReturn.OK


def drop_delta_probe(pad: Gst.Pad, info: Gst.PadProbeInfo, udata) -> Gst.PadProbeReturn:
    buf: Gst.Buffer = info.get_buffer()
    if buf.has_flags(Gst.BufferFlags.DELTA_UNIT):
        return Gst.PadProbeReturn.DROP

    return Gst.PadProbeReturn.OK


def main():
    Gst.init(None)

    pipeline: Gst.Pipeline = Gst.parse_launch(
        'filesrc location=fragment_1736513537115810865.mkv ! '
        'parsebin name=parse ! '
        'nvv4l2decoder name=decode skip-frames=decode_key ! '
        'nvvideoconvert name=cvt ! '
        'nveglglessink sync=1'
    )

    pipeline.get_by_name('cvt').get_static_pad('sink').add_probe(Gst.PadProbeType.BUFFER, pts_print_probe, None)
    # pipeline.get_by_name('decode').get_static_pad('sink').add_probe(Gst.PadProbeType.BUFFER, drop_delta_probe, None)

    pipeline.set_state(Gst.State.PLAYING)

    bus = pipeline.get_bus()
    while True:
        message = bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
        if message:
            if message.type == Gst.MessageType.ERROR:
                print('Error:', message.parse_error())
            break

    pipeline.set_state(Gst.State.NULL)


if __name__ == "__main__":
    main()

These are the resulting PTS values:

pts: 1736513537116000000, 2025-01-10 12:52:17.116000+00:00
pts: 1736513546108000000, 2025-01-10 12:52:26.108000+00:00 << Huge gap here
pts: 1736513547066000000, 2025-01-10 12:52:27.066000+00:00
pts: 1736513548265000000, 2025-01-10 12:52:28.265000+00:00
pts: 1736513549386000000, 2025-01-10 12:52:29.386000+00:00
pts: 1736513550666000000, 2025-01-10 12:52:30.666000+00:00 << Almost without time delta here
pts: 1736513550706000000, 2025-01-10 12:52:30.706000+00:00
pts: 1736513550746000000, 2025-01-10 12:52:30.746000+00:00
pts: 1736513550786000000, 2025-01-10 12:52:30.786000+00:00
pts: 1736513550826000000, 2025-01-10 12:52:30.826000+00:00
pts: 1736513550866000000, 2025-01-10 12:52:30.866000+00:00
pts: 1736513550906000000, 2025-01-10 12:52:30.906000+00:00

However, when we filter out non-key frames before the decoder, the data look as I would expect. We can replicate this with a tiny change in the script.

# Use decode_all here
'nvv4l2decoder name=decode skip-frames=decode_all ! '
...
# Uncomment this line
pipeline.get_by_name('decode').get_static_pad('sink').add_probe(Gst.PadProbeType.BUFFER, drop_delta_probe, None)

The output is now correct (I have also verified the values using a third-party library, PyAV):

pts: 1736513537116000000, 2025-01-10 12:52:17.116000+00:00
pts: 1736513538315000000, 2025-01-10 12:52:18.315000+00:00
pts: 1736513539515000000, 2025-01-10 12:52:19.515000+00:00
pts: 1736513540715000000, 2025-01-10 12:52:20.715000+00:00
pts: 1736513541915000000, 2025-01-10 12:52:21.915000+00:00
pts: 1736513543113000000, 2025-01-10 12:52:23.113000+00:00
pts: 1736513543753000000, 2025-01-10 12:52:23.753000+00:00
pts: 1736513544951000000, 2025-01-10 12:52:24.951000+00:00
pts: 1736513546148000000, 2025-01-10 12:52:26.148000+00:00
pts: 1736513547346000000, 2025-01-10 12:52:27.346000+00:00
pts: 1736513548546000000, 2025-01-10 12:52:28.546000+00:00
pts: 1736513549746000000, 2025-01-10 12:52:29.746000+00:00

How is this possible? Is this behaviour correct and expected, or is it possibly a bug in the timestamps of the frames?

Here is the video file used in the example: fragment_1736513537115810865.mp4 - Google Drive

Thank you for your time reading my question.

It may be a bug in the frame timestamps. We’ll investigate this ASAP.

Hi @zetxy89, this issue will be fixed in the next release; please stay tuned for our updates. Thanks a lot.

Hello, thank you very much for investigating and making a fix for this.

If anyone else comes across this, for now we can apply the following probe on the decoder's sink pad, which lets only key frames into the decoder. It roughly simulates the skip-frames=decode_key setting of the decoder.

    def _probe_impl(self, pad: Gst.Pad, info: Gst.PadProbeInfo, u_data) -> Gst.PadProbeReturn:
        buf: Gst.Buffer = info.get_buffer()
        if buf.has_flags(Gst.BufferFlags.DELTA_UNIT):
            return Gst.PadProbeReturn.DROP
        return Gst.PadProbeReturn.OK
