Nvv4l2decoder not decoding every frame individually


I am running an experimental GStreamer application with a pipeline that looks like:
filesrc location=stopwatch_1080p30.mp4 ! qtdemux ! h264parse name=parser config-interval=-1 ! nvv4l2decoder name=decoder ! fakesink name=fsink
stopwatch_1080p30_log.txt (139.6 KB)

pipeline.cpp (4.5 KB)
parking_record_log.txt (237.6 KB)

Below is my test environment.
Ubuntu 18.04
DeepStream 5.0
GStreamer version 1.14.5
Tegra: R32
Jetpack version: 4.5.1-b17
CUDA version: 10.2.89

I have 2 mp4 videos: 1) parking_record.mp4, 2) stopwatch_1080p30.mp4

Whenever I run the parking_record.mp4 video in the application, I see that nvv4l2decoder first stores around 16-17 frames and then decodes and pushes them.
But when I run stopwatch_1080p30.mp4 in the application, nvv4l2decoder gets each frame, decodes it, and pushes it individually.
I have attached logs for both experiments for reference.
I have also attached both mp4 videos and the GStreamer application code (pipeline.cpp).
(Change the filesrc location accordingly to see the behavior.)

To compile the code, please run the command below.
$ g++ -Wall pipeline.cpp -o gstApp $(pkg-config --cflags --libs gstreamer-1.0)

Can you tell me why this behaviour occurs with parking_record.mp4?
And what needs to be changed in the nvv4l2decoder properties/code (if needed) to decode every frame individually for parking_record.mp4?

It is not the decoder but the whole pipeline that decides when to decode a frame. You can set the “sync=false” property on fakesink to make the pipeline handle the frames ASAP.

Hi @Fiona.Chen

Thanks for your reply.
Fakesink’s ‘sync’ property is false by default,
so in my pipeline the fakesink ‘sync’ property is already false.

park.txt (8.4 MB)
The decoder does decode the frames individually. The reason you cannot get the frame data on the sink pad is that the sink does not consume the frames at that time. GStreamer works in an asynchronous way: the data stays in the upstream component until the downstream component consumes it. You can check the log in park.txt: the sink received the segment event after the decoder had decoded 18 frames. The segment event comes from qtdemux, which is an upstream component.
So the information in the mp4 format decides when the sink will consume the output from upstream.
I have tested the raw H264 data inside parking_record.mp4; each frame is decoded and consumed by the sink immediately.
Please refer to the MP4 format spec (ISO - ISO/IEC 14496-12:2020 - Information technology — Coding of audio-visual objects — Part 12: ISO base media file format) and the qtdemux source code (gst/isomp4/qtdemux.c · master · GStreamer / gst-plugins-good · GitLab) to understand why the segment event reaches the sink with a delay.
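The event ordering described above can be checked directly in the log with grep. A minimal sketch, using synthetic sample lines (the exact wording in park.txt differs, so adjust the patterns to the real log):

```shell
# Write a tiny synthetic excerpt that mimics the ordering seen in park.txt:
# decoded-frame activity is logged before the sink receives the segment event.
cat > /tmp/park_excerpt.txt <<'EOF'
v4l2videodec: Handling frame 0
v4l2videodec: Handling frame 1
v4l2videodec: Process output buffer
basesink: received event segment
EOF
# List decode activity and the segment event in log order, with line numbers:
grep -n -e "Handling frame" -e "Process output buffer" -e "segment" /tmp/park_excerpt.txt
```

Running the same grep against the real park.txt shows the segment event line appearing only after the decoded-frame lines.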

I only tested the parking video case.
park_0.txt (11.0 MB)

The log park_0.txt is for the raw H264 data pipeline; it shows the segment event is received from upstream after two frames are decoded.
The pipeline for raw H264 data:
gst-launch-1.0 --gst-debug=basesink:7,h264parse:5,v4l2videodec:7 filesrc location=parking.264 ! h264parse name=parser config-interval=-1 ! nvv4l2decoder name=decoder ! fakesink name=fsink
It has nothing to do with DeepStream. It is just a matter of basic multimedia concepts and spec understanding. You can investigate by yourself or consult the GStreamer community.

Hi @Fiona.Chen

Thanks for your reply.

I checked the park_0.txt log.
I saw that fakesink received the segment event after frame 17,
and the decoder is decoding frames only in bulk (16-17 frames).
I ran the following command on the park_0.txt log:
$ cat park_0.txt | grep -e "Allocate output buffer" -e "Process output buffer" -e "Handling frame"
I have attached the output of the above command.
park_0_short.txt (1.1 MB)
Here you can see that it gets 16 frames first and then decodes all of them.

I created an h264 video from parking_record.mp4 via the command below.
$ ffmpeg -i parking_record.mp4 -c:v libx264 parking.h264
I have attached resulted h264 file.
parking.h264 (1.7 MB)
I ran this file with the pipeline you sent.
gst-launch-1.0 --gst-debug=basesink:7,h264parse:5,v4l2videodec:7 filesrc location=parking.h264 ! h264parse name=parser config-interval=-1 ! nvv4l2decoder name=decoder ! fakesink name=fsink
Here also, it does not decode frames individually.
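One caveat with the ffmpeg command above: -c:v libx264 re-encodes the video, so the resulting bitstream (GOP structure, reference-frame settings) is not the same stream that qtdemux feeds the decoder from parking_record.mp4. To test the exact original stream, a stream copy can be used instead. A sketch (the output name parking_copy.264 is just an example; the command is echoed rather than executed here, since it needs ffmpeg and the source file to be present):

```shell
# -c:v copy extracts the H.264 stream without re-encoding;
# -bsf:v h264_mp4toannexb converts it to the Annex B byte-stream
# format that a filesrc ! h264parse pipeline expects.
CMD='ffmpeg -i parking_record.mp4 -c:v copy -bsf:v h264_mp4toannexb parking_copy.264'
echo "$CMD"
```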

You can see the 16 frames in the decoder pool, but not on the src pad.

Hi @Fiona.Chen

Thanks for your reply.
I would like to know why the decoder pool stores 16 frames for this particular stream.
This behaviour is not observed if we use avdec_h264 instead of nvv4l2decoder.
As I mentioned earlier, this behaviour is also not observed with other streams.