omxh264enc slow and dropping the first several frames on Xavier (GStreamer)

I’ve been investigating issues with omxh264enc on Xavier (I never had a problem on TX2). I will post more details soon, but one thing I see every time (100% reproducible) is that when running a pipeline, a bunch of frames at the beginning of the video get dropped: they never make it to the pipeline sink.

This is true even though I have no leaky queues and every buffer has GST_BUFFER_FLAG_NON_DROPPABLE set. Is this because omxh264enc takes some time to initialize and drops frames during that time? Is this expected?

FWIW, I think this is related to this issue ( https://devtalk.nvidia.com/default/topic/1047670/video-encoder-performance-when-using-appsink-omxh264enc- ) which, although it’s marked as answered, is actually still quite mysterious (and 100% reproducible for me on all Xavier hardware but on no TX2 hardware).

As promised, attached is a small test app to repro the problem. When I run the app on a Xavier, I get output such as the below. Notice:

  • Several (~30) frames get fed into the appsrc before the sink starts receiving frames.
  • More importantly, those 30 frames (aside from the first) are lost! After that, frames seem to get processed correctly.
  • Lest you think this is a timing issue, if you feed only one frame per second (instead of one every 10ms) you still have the same issue: It'll take ~30 seconds before the sink starts seeing frames!
  • Running the same app on a TX2 yields none of these problems.

Sample output:

Src:  0
Src:  18068838
Src:  29706809
Src:  41311210
Src:  52733491
Src:  64219584
Src:  75699019
Src:  87189816
Src:  98730630
Src:  110334520
Src:  121873382
Src:  133430709
Src:  144996549
Src:  156485777
Src:  168040064
Src:  179571022
Src:  191120701
Src:  202675436
Src:  214269949
Src:  225902832
Src:  237513889
Src:  249089329
Src:  260695522
Src:  272250865
Src:  283777983
Src:  295326286
Src:  306851036
Src:  318402347
Src:  329991900
Src:  341589485
Src:  353221215
Sink: 0
Sink: 0
Sink: 353221215
Sink: 353221215
Sink: 353221215
Src:  364796111
Sink: 364796111
...
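
For anyone who wants to quantify the loss from a log like the one above, here is a minimal sketch (not part of the attached test app). It assumes the log format shown above, with "Src:" and "Sink:" lines each carrying one timestamp, and it treats any source timestamp that never shows up on the sink side as lost. The function name summarize_log and the abbreviated sample are hypothetical.

```python
import re

# Matches lines such as "Src:  18068838" or "Sink: 0" (format assumed from the log above).
LINE_RE = re.compile(r"\s*(Src|Sink):\s*(\d+)\s*$")

def summarize_log(lines):
    """Return (src, sink, lost): timestamps pushed into the source,
    timestamps seen at the sink, and source timestamps never seen at the sink."""
    src, sink = [], []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip "..." and any other non-matching lines
        (src if m.group(1) == "Src" else sink).append(int(m.group(2)))
    seen = set(sink)
    lost = [ts for ts in src if ts not in seen]
    return src, sink, lost

if __name__ == "__main__":
    # Abbreviated sample using a few values from the output above.
    sample = """\
Src:  0
Src:  18068838
Src:  29706809
Src:  353221215
Sink: 0
Sink: 0
Sink: 353221215
Src:  364796111
Sink: 364796111
""".splitlines()
    src, sink, lost = summarize_log(sample)
    print("fed:", len(src), "distinct at sink:", len(set(sink)), "lost:", lost)
```

Running the same function over the full captured output (for example, the lines of a saved log file) gives the list of timestamps that never reached the sink.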

gsttest-omxh264enc.tar.gz (231 KB)

Also, just to be clear, I believe that this is just one symptom of an underlying problem with the encoder. The encoded video files are all corrupted in some subtle way that I’m still trying to figure out (causing other big problems in our system), but I assume that these issues all have the same cause.

Hi,
On r32.1, please try the nvv4l2h264enc plugin.

Thank you for the response, I will certainly do so.

Can you please explain the difference between nvv4l2h264enc and omxh264enc? I know about v4l and the omx standard, but I don’t see why this necessitates two different encoders… Is one of these more appropriate in certain pipelines?

Hi,
To run omxh264enc at max frequency, you need to apply the patch from this post:
https://devtalk.nvidia.com/default/topic/1032771/jetson-tx2/no-encoder-perfomance-improvement-before-after-jetson_clocks-sh/post/5255605/#5255605

nvv4l2h264enc does not need the patch. It runs at max frequency always.

Thanks again Dane.

I still need to perform further testing, but it looks like nvv4l2h264enc fixes my issue. (I will accept as answer after I complete my tests.)

However, there are still mysteries:

  • Why does running at "max frequency" explain the issue I'm talking about? It's true that performance is one problem, but I think the encoder is simply not behaving in a valid manner, regardless of performance... I think there is some bug here.
  • Aside from the performance (and the lack of the bug), what is the difference between nvv4l2h264enc and omxh264enc? All other things being equal, should Jetson apps switch to the v4l2 encoders/decoders?

Hi,
The two plugins sit in different SW stacks but use the same hardware engine:

omxh264enc - gstomx(OMX IL) - HW engine NVENC
nvv4l2h264enc - gst-nvvideo4linux2(v4l2) - HW engine NVENC

gstomx and gst-nvvideo4linux2 are open source
https://developer.nvidia.com/embedded/dlc/l4t-sources-32-1-JAX-TX2

GStreamer is at version 1.14.1 on r32 and at 1.8.3 on r28. The version difference may explain the difference in behavior.

In the long run, we will switch to v4l2 to unify the interfaces of desktop GPUs and Jetson platforms (TX1/TX2/Xavier/Nano).
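
Since both plugins drive the same NVENC engine, an A/B comparison only needs the encoder element swapped. Below is a rough sketch of a helper that builds the two pipeline descriptions. The helper name build_pipeline, the videotestsrc/fakesink endpoints, the resolution, and the nvvidconv conversion to NVMM memory are all assumptions chosen for illustration; this is not the pipeline from the attached test app and has not been verified on Xavier hardware.

```python
# Encoder element names taken from this thread.
ENCODERS = ("omxh264enc", "nvv4l2h264enc")

def build_pipeline(encoder="nvv4l2h264enc", num_buffers=100):
    """Build a gst-launch style description that differs only in the encoder."""
    if encoder not in ENCODERS:
        raise ValueError(f"unknown encoder {encoder!r}, expected one of {ENCODERS}")
    return (
        f"videotestsrc num-buffers={num_buffers} ! "
        "video/x-raw,width=1280,height=720,framerate=30/1 ! "
        # Assumption: feed the encoder NVMM buffers via nvvidconv.
        "nvvidconv ! video/x-raw(memory:NVMM),format=I420 ! "
        f"{encoder} ! h264parse ! fakesink"
    )

if __name__ == "__main__":
    for name in ENCODERS:
        print(build_pipeline(name))
```

The resulting strings could be handed to gst-launch-1.0 (with the caps quoted for the shell) or to Gst.parse_launch in an application, which would keep everything except the encoder identical between the two runs.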