GStreamer issue on NX: Nano is faster than NX!

Hi there,

I was using the following commands on a NANO and they worked perfectly. I then switched to an NX and ran the exact same commands, but surprisingly the video now has lots of hiccups and short pauses.
I am using
gstreamer (gst-launch-1.0) Version: 1.14.5,
Jetpack 4.4.1 (L4T 32.4.4)
CUDA:10.2.89
cuDNN: 8.0.0.180
TensorRT: 7.1.3.0

On both NX and NANO.

I have tried with and without jetson_clocks on both boards, with no difference in performance. CPU and RAM usage stay below 30% while the commands are running.
My test involves 4 scripts:

  1. TCP SERVERS: gets the frames from a USB 3.0 camera and provides them on multiple tcpserversinks for different types of clients:

SCRIPT:
gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,format=YUY2, width=1920, height=1080, framerate=60/1 ! tee name=t t. ! videorate max-rate=15 ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1280' ! omxh264enc bitrate=10000000 preset-level=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! queue ! rtph264pay config-interval=1 pt=96 ! gdppay ! queue ! tcpserversink port=5002 sync=false async=false t. ! videorate max-rate=20 ! nvvidconv ! 'video/x-raw(memory:NVMM),width=880,height=512' ! omxh264enc bitrate=2000000 preset-level=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! queue ! rtph264pay config-interval=1 pt=96 ! gdppay ! queue ! tcpserversink port=5001 sync=false async=false t. ! videorate max-rate=30 ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080' ! omxh264enc bitrate=10000000 preset-level=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! queue ! rtph264pay config-interval=1 pt=96 ! gdppay ! queue ! tcpserversink port=5000 sync=false async=false -e

OUTPUT:
Setting pipeline to PAUSED …
Pipeline is live and does not need PREROLL …
Setting pipeline to PLAYING …
New clock: GstSystemClock
Framerate set to : 15 at NvxVideoEncoderSetParameterNvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 40
Framerate set to : 20 at NvxVideoEncoderSetParameterNvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 40
Framerate set to : 30 at NvxVideoEncoderSetParameterNvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 40

  2. TCP CLIENTS: Here I only list the ones that are problematic (slow) on NX (as mentioned earlier, they work perfectly fine on NANO).

2.1. Client 1: Gets data from the TCP server, modifies it, and sends it to a udpsink.
Script:
gst-launch-1.0 tcpclientsrc port=5002 ! gdpdepay ! rtph264depay ! queue ! h264parse ! omxh264dec ! 'video/x-raw(memory:NVMM),width=1920,height=1280' ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1280' ! nvvidconv ! 'video/x-raw,width=1280,height=720' ! textoverlay text="TITLE" shaded-background=yes deltay=-20 deltax=-20 valignment=top halignment=left font-desc="Sans, 10" ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! omxh264enc bitrate=2000000 preset-level=0 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! queue ! rtph264pay config-interval=1 pt=96 ! gdppay ! queue ! udpsink host=10.0.0.1 port=6000 buffer-size=100000000 sync=false async=false -e

Client 1 OUTPUT:

Setting pipeline to PAUSED …
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.965: gst_caps_is_empty: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.966: gst_caps_truncate: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.966: gst_caps_fixate: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.966: gst_caps_get_structure: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.966: gst_structure_get_string: assertion ‘structure != NULL’ failed

(gst-launch-1.0:13179): GStreamer-CRITICAL **: 14:35:01.966: gst_mini_object_unref: assertion ‘mini_object != NULL’ failed
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Allocating new output: 1920x1280 (x 18), ThumbnailMode = 0
OPENMAX: HandleNewStreamFormat: 3605: Send OMX_EventPortSettingsChanged: nFrameWidth = 1920, nFrameHeight = 1280
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 40
reference in DPB was never decoded

NOTE1: In spite of those "CRITICAL" warnings, the pipeline works fine.
NOTE2: These messages appear on both NANO and NX.

Question1: Is it possible that these warnings indicate errors in the background that are contributing to the problem I described?
Question2: Why are these warnings happening? How can I modify my GStreamer pipeline to get rid of them?

Client 2.1.1 (on a remote Windows machine): gets data from udpsrc and displays it on the monitor.
Script:
gst-launch-1.0 udpsrc port=6000 ! gdpdepay ! rtph264depay ! avdec_h264 ! video/x-raw ! autovideosink sync=false async=false

Output:
Setting pipeline to PAUSED …
Pipeline is live and does not need PREROLL …
Got context from element ‘autovideosink0’: gst.d3d11.device.handle=context, device=(GstD3D11Device)"(GstD3D11Device)\ d3d11device4", adapter=(int)0;
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock
Redistribute latency…
0:34:36.8 / 99:99:99.

Note: At this point I can see the video on the display (on the remote Windows machine), but the video is much smoother when it comes from the NANO than from the NX.

2.2. Client 2: Gets the data from TCPSERVER and saves it on the disk.
Script:
gst-launch-1.0 tcpclientsrc port=5000 ! gdpdepay ! rtph264depay ! queue ! h264parse ! mpegtsmux ! filesink location=/media/6FE3-EC2C/test.ts -e
Output:
Setting pipeline to PAUSED …
Pipeline is PREROLLING …
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock

Note: After running this command, the file (test.ts) is created on the disk and keeps recording the data. (Again, the video recorded on NANO is smoother.)

2.3. Client 3: Gets data from the TCP server, adds text overlays, then saves it on the disk.
SCRIPT:

gst-launch-1.0 tcpclientsrc port=5000 ! gdpdepay ! rtph264depay ! queue ! h264parse ! omxh264dec ! 'video/x-raw(memory:NVMM),width=1920,height=1080' ! nvvidconv ! 'video/x-raw,width=1920,height=1080' ! textoverlay text="TITLE" valignment=top halignment=left font-desc="Sans,15" ! textoverlay text="TITLE2" valignment=bottom halignment=left deltay=-45 font-desc="Sans,10" shaded-background=yes ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! omxh264enc bitrate=10000000 preset-level=2 ! 'video/x-h264,stream-format=(string)byte-stream' ! h264parse ! mpegtsmux ! filesink location=/media/6FE3-EC2C/test_overlay.ts -e

OUTPUT:

Setting pipeline to PAUSED …
Pipeline is PREROLLING …

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.311: gst_caps_is_empty: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.311: gst_caps_truncate: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.311: gst_caps_fixate: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.311: gst_caps_get_structure: assertion ‘GST_IS_CAPS (caps)’ failed

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.311: gst_structure_get_string: assertion ‘structure != NULL’ failed

(gst-launch-1.0:3256): GStreamer-CRITICAL **: 17:27:21.312: gst_mini_object_unref: assertion ‘mini_object != NULL’ failed
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Allocating new output: 1920x1088 (x 11), ThumbnailMode = 0
OPENMAX: HandleNewStreamFormat: 3605: Send OMX_EventPortSettingsChanged: nFrameWidth = 1920, nFrameHeight = 1080
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 40
reference in DPB was never decoded
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock

NOTE1: I still get those “CRITICAL” warnings but again the pipeline works and creates the video files.
NOTE2: As you may have noticed, I am writing to a USB flash drive, which is USB3, but due to the bandwidth of the USB3 and the amount of inputs (from camera) and outputs (to files) I don’t think this would be an issue.
NOTE3: It is the exact same setup on NANO as well, but again NANO performs much better.

Question3: How can I get rid of these “CRITICAL” warnings?

ALL clients (except for the one that I mentioned which runs on windows) and the server script are running locally on NX (or NANO) at the same time.

I would truly appreciate it if someone could tell me what might be wrong, or what I can do to get at least the same performance on NX, if not better. It would also be great if you could answer questions 1-3 about the CRITICAL messages that I get after running the scripts.

Thank you!

Not sure, but you may try:

  1. TCP servers
  • Add a queue at the start of each sub-pipeline after the tee in the TCP server.
  • If your camera can provide a lower framerate than 60 fps, try lowering it, and if it can do 15 fps, try removing videorate.
  • Since you use gdp and TCP, try an mkv container rather than rtph264pay, which relies on UDP packetization.
  2. TCP clients:
    If using RTP over UDP, you may remove gdp.
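
A minimal sketch of the first and third server suggestions (a queue right after each tee branch, and a Matroska container over TCP in place of rtph264pay plus gdppay). It is untested on this exact setup. The caps, bitrates and ports are copied from the first post; `matroskamux streamable=true`, `matroskademux`, the function names and the `run` dry-run helper are assumptions added for illustration.

```shell
#!/bin/sh
# Dry run by default: each command is only printed (shell quoting is not
# preserved in the printed form). Set RUN=1 on the Jetson to launch it.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# Server: a queue directly after each tee branch, Matroska instead of RTP+GDP.
# Only two of the three branches from the first post are shown.
server() {
  run gst-launch-1.0 -e \
    v4l2src device=/dev/video0 ! \
    'video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1' ! \
    tee name=t \
    t. ! queue ! videorate max-rate=30 ! nvvidconv ! \
    'video/x-raw(memory:NVMM),width=1920,height=1080' ! \
    omxh264enc bitrate=10000000 preset-level=2 ! h264parse ! \
    matroskamux streamable=true ! \
    tcpserversink port=5000 sync=false async=false \
    t. ! queue ! videorate max-rate=20 ! nvvidconv ! \
    'video/x-raw(memory:NVMM),width=880,height=512' ! \
    omxh264enc bitrate=2000000 preset-level=2 ! h264parse ! \
    matroskamux streamable=true ! \
    tcpserversink port=5001 sync=false async=false
}

# Client: demux the Matroska stream and remux to MPEG-TS on disk (like Client 2).
client() {
  run gst-launch-1.0 -e \
    tcpclientsrc port=5000 ! matroskademux ! h264parse ! \
    mpegtsmux ! filesink location=/media/6FE3-EC2C/test.ts
}

case "${1:-}" in
  server) server ;;
  client) client ;;
  *) RUN=0; server; client ;;   # no argument: only print both commands
esac
```

Saved under a hypothetical name such as mkv_tcp.sh, this would be started as `RUN=1 ./mkv_tcp.sh server` in one terminal and `RUN=1 ./mkv_tcp.sh client` in another.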

Hi,
Please execute sudo nvpmodel -m 2, sudo jetson_clocks and try again. All power modes are listed in

You can execute sudo tegrastats to check CPU/GPU clocks.
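
To pick out just the engine clocks from a tegrastats line, a small awk filter may help. This is only a sketch: `engine_clock` is a made-up helper name, and it assumes the "NAME value" field layout seen in the tegrastats output posted further down in this thread (the sample line is copied from there).

```shell
#!/bin/sh
# Print the value that follows a given field name (e.g. NVENC, NVDEC)
# in each tegrastats line read from stdin.
engine_clock() {
  awk -v name="$1" '{ for (i = 1; i < NF; i++) if ($i == name) print $(i + 1) }'
}

# Sample line copied from the tegrastats output in this thread.
sample='RAM 2819/7764MB (lfb 531x4MB) SWAP 0/3882MB (cached 0MB) CPU [33%@1420,14%@1420,22%@1420,17%@1420,12%@1420,17%@1420] EMC_FREQ 18%@1600 GR3D_FREQ 0%@1109 NVENC 499 NVENC1 499 NVDEC 665 NVDEC1 665 APE 150 MTS fg 0% bg 5%'

printf '%s\n' "$sample" | engine_clock NVENC   # prints 499
printf '%s\n' "$sample" | engine_clock NVDEC   # prints 665
```

On the board the same filter could be fed live, for example with `sudo tegrastats | engine_clock NVENC`.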

Thank you DaneLLL for your response. I executed those commands, but the result is the same.

Thank you for your response.

  • I added queue in front of the sub-pipelines.
  • My camera unfortunately only has 60 FPS output.
  • I have not worked with mkv container, can you send me an example ?

Hi,
Please use nvv4l2h264enc and set maxperf-enable=1

  maxperf-enable      : Enable or Disable Max Performance mode
                        flags: readable, writable, changeable only in NULL or READY state
                        Boolean. Default: false

This runs the encoder at maximum clock, which should bring maximum encoder throughput.
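
For illustration, one server branch from the first post with omxh264enc swapped for nvv4l2h264enc could look roughly like the sketch below. It is an assumption-laden example, not a verified command: the `format=NV12` caps field, the `run` dry-run helper and the function name are additions, and the omxh264enc `preset-level` setting is dropped because its values may not map one-to-one onto nvv4l2h264enc (check `gst-inspect-1.0 nvv4l2h264enc`).

```shell
#!/bin/sh
# Dry run by default: the command is only printed. Set RUN=1 on the Jetson.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# 30 fps branch of the TCP server, encoded with nvv4l2h264enc at max clocks.
encode_branch() {
  run gst-launch-1.0 -e \
    v4l2src device=/dev/video0 ! \
    'video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1' ! \
    videorate max-rate=30 ! nvvidconv ! \
    'video/x-raw(memory:NVMM),format=NV12,width=1920,height=1080' ! \
    nvv4l2h264enc bitrate=10000000 maxperf-enable=1 ! \
    h264parse ! queue ! rtph264pay config-interval=1 pt=96 ! gdppay ! \
    queue ! tcpserversink port=5000 sync=false async=false
}

encode_branch
```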

Dear DaneLLL,

Thank you for your suggestions. Here is what I did >>

For the next tests, I set the power mode to 2 (15W 6Core) using sudo nvpmodel -m 2, ran sudo jetson_clocks, and then ran the following GStreamer commands.

Test 1:
Running the scripts exactly as given in my first post.

Output of tegrastats:
*** sudo tegrastats >> To check the load of CPU/GPU/…
Results:

RAM 2409/7764MB (lfb 873x4MB) SWAP 0/3882MB (cached 0MB) CPU [26%@1420,10%@1420,19%@1420,21%@1420,12%@1420,12%@1420] EMC_FREQ 17%@1600 GR3D_FREQ 0%@1109 NVENC 358 NVDEC 192 NVDEC1 192 APE 150 MTS fg 0% bg 5% AO@38C GPU@38.5C PMIC@100C AUX@38C CPU@39C thermal@38.45C VDD_IN 6103/6103 VDD_CPU_GPU_CV 1474/1474 VDD_SOC 2167/2167

RAM 2410/7764MB (lfb 871x4MB) SWAP 0/3882MB (cached 0MB) CPU [26%@1420,12%@1420,19%@1420,21%@1420,14%@1420,16%@1420] EMC_FREQ 17%@1600 GR3D_FREQ 0%@1109 NVENC 358 NVDEC 192 NVDEC1 192 APE 150 MTS fg 0% bg 5% AO@38C GPU@38.5C PMIC@100C AUX@38C CPU@39C thermal@38.3C VDD_IN 6184/6143 VDD_CPU_GPU_CV 1515/1494 VDD_SOC 2249/2208

Qualitative output:
As mentioned earlier, lots of hiccups and pauses in the video

Test 2:

Replaced my h264 decoders with: nvv4l2decoder enable-max-performance=1

Output of tegrastats:

RAM 2598/7764MB (lfb 687x4MB) SWAP 0/3882MB (cached 0MB) CPU [34%@1420,16%@1420,27%@1420,24%@1420,24%@1420,17%@1420] EMC_FREQ 21%@1600 GR3D_FREQ 0%@1109 NVENC 294 NVDEC 665 NVDEC1 665 APE 150 MTS fg 0% bg 6% AO@38C GPU@39C PMIC@100C AUX@38.5C CPU@39.5C thermal@38.6C VDD_IN 6635/6635 VDD_CPU_GPU_CV 1679/1679 VDD_SOC 2453/2453

RAM 2599/7764MB (lfb 685x4MB) SWAP 0/3882MB (cached 0MB) CPU [31%@1420,13%@1420,15%@1420,18%@1420,13%@1420,11%@1420] EMC_FREQ 19%@1600 GR3D_FREQ 0%@1109 NVENC 307 NVDEC 665 NVDEC1 665 APE 150 MTS fg 0% bg 5% AO@38C GPU@39C PMIC@100C AUX@38C CPU@39C thermal@38.6C VDD_IN 6144/6389 VDD_CPU_GPU_CV 1433/1556 VDD_SOC 2249/2351

Qualitative output:

A little bit improved from Test 1

Test 3:

Replaced my h264 encoder with: nvv4l2h264enc maxperf-enable=1
and
Replaced my h264 decoder with: nvv4l2decoder enable-max-performance=1
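
The modified commands for this test are not shown in the post. As a sketch only, Client 3 with both replacements might look like this, assuming a drop-in swap with every other element unchanged from the first post (the `run` helper and the function name are illustrative, and the explicit NVMM caps after the decoder are left out):

```shell
#!/bin/sh
# Dry run by default: the command is only printed. Set RUN=1 on the Jetson.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# Client 3 with nvv4l2decoder and nvv4l2h264enc in place of the omx elements.
client3() {
  run gst-launch-1.0 -e \
    tcpclientsrc port=5000 ! gdpdepay ! rtph264depay ! queue ! h264parse ! \
    nvv4l2decoder enable-max-performance=1 ! nvvidconv ! \
    'video/x-raw,width=1920,height=1080' ! \
    textoverlay text=TITLE valignment=top halignment=left font-desc='Sans,15' ! \
    textoverlay text=TITLE2 valignment=bottom halignment=left deltay=-45 \
      font-desc='Sans,10' shaded-background=yes ! \
    nvvidconv ! 'video/x-raw(memory:NVMM)' ! \
    nvv4l2h264enc bitrate=10000000 maxperf-enable=1 ! h264parse ! \
    mpegtsmux ! filesink location=/media/6FE3-EC2C/test_overlay.ts
}

client3
```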

Output of tegrastats:

RAM 2819/7764MB (lfb 531x4MB) SWAP 0/3882MB (cached 0MB) CPU [33%@1420,14%@1420,22%@1420,17%@1420,12%@1420,17%@1420] EMC_FREQ 18%@1600 GR3D_FREQ 0%@1109 NVENC 499 NVENC1 499 NVDEC 665 NVDEC1 665 APE 150 MTS fg 0% bg 5% AO@44C GPU@45C PMIC@100C AUX@44.5C CPU@45.5C thermal@44.8C VDD_IN 6184/6184 VDD_CPU_GPU_CV 1597/1597 VDD_SOC 2167/2167

RAM 2820/7764MB (lfb 531x4MB) SWAP 0/3882MB (cached 0MB) CPU [31%@1420,15%@1420,21%@1420,15%@1420,7%@1420,13%@1420] EMC_FREQ 18%@1600 GR3D_FREQ 0%@1109 NVENC 499 NVENC1 499 NVDEC 665 NVDEC1 665 APE 150 MTS fg 0% bg 5% AO@44C GPU@45C PMIC@100C AUX@44.5C CPU@45C thermal@44.95C VDD_IN 6103/6143 VDD_CPU_GPU_CV 1556/1576 VDD_SOC 2126/2146

Qualitative output:

The quality of the recorded videos is significantly improved from test 1 but still not as ideal as NANO!

Also, if it helps, below is the GStreamer output after running the modified Client 3 (section 2.3) in Test 3:
Setting pipeline to PAUSED …
Opening in BLOCKING MODE
Opening in BLOCKING MODE
Pipeline is PREROLLING …
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261

(gst-launch-1.0:6141): GStreamer-CRITICAL **: 18:56:26.917: gst_mini_object_unref: assertion ‘mini_object != NULL’ failed
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Redistribute latency…
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 0
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock

Could it be a hardware-related or driver-related issue?

Hi,
The tegrastats output shows that the encoder and decoder are at max frequencies, so it should not be an issue in hardware encoding/decoding. Please break down the pipeline with fpsdisplaysink text-overlay=0 video-sink=fakesink -v. By breaking down the pipeline and checking the framerate at each step, we should be able to find out which plugin slows down the pipeline.
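
One way to do this breakdown is to cut the server pipeline after successive elements and end each partial pipeline with the fps-measuring sink. The sketch below uses the 15 fps branch from the first post; the three cut points, the `probe`/`stage` names and the `run` dry-run helper are only illustrative choices.

```shell
#!/bin/sh
# Dry run by default: commands are only printed. Set RUN=1 on the Jetson.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# End any partial pipeline with the suggested fps-measuring sink.
probe() {
  run gst-launch-1.0 -v "$@" ! fpsdisplaysink text-overlay=0 video-sink=fakesink
}

CAM_CAPS='video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1'
NVMM_CAPS='video/x-raw(memory:NVMM),width=1920,height=1280'

# Stage 1: camera capture only.
stage1() { probe v4l2src device=/dev/video0 ! "$CAM_CAPS"; }
# Stage 2: add rate limiting and conversion to NVMM memory.
stage2() {
  probe v4l2src device=/dev/video0 ! "$CAM_CAPS" ! \
    videorate max-rate=15 ! nvvidconv ! "$NVMM_CAPS"
}
# Stage 3: add the hardware encoder.
stage3() {
  probe v4l2src device=/dev/video0 ! "$CAM_CAPS" ! \
    videorate max-rate=15 ! nvvidconv ! "$NVMM_CAPS" ! \
    omxh264enc bitrate=10000000 preset-level=2 ! h264parse
}

case "${1:-}" in
  1|2|3) "stage$1" ;;
  *) RUN=0; stage1; stage2; stage3 ;;   # no argument: only print all stages
esac
```

Running one stage at a time (for example `RUN=1 sh script.sh 2`, with a script name of your choosing) and comparing the reported framerates should point at the first element where the rate drops.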

Hi,

So I applied fpsdisplaysink to the streaming pipeline (on Test 3, with nvv4l2h264enc and nvv4l2decoder). Strangely enough, it shows exactly 15 average FPS with 0 dropped frames (on client 1, which is set to 15 FPS), but the video still has those hiccups. I attached a video (with the fpsdisplaysink text overlay) to show exactly what I mean by hiccups.