TX2 h264 encoding, 60 FPS camera, 30 FPS Video Stream

We’re working on evaluating video quality and performance with a TX2
encoding 3 1080p/60 video streams (SDI converted to CSI). In general
things work, but when testing with scenes with rapid motion (highway
traffic) the video isn’t smooth.

I fixed one frame rate problem, however in investigating deeper I found
that while the capture is 60 fps, the encoded video stream seems to be
limited to 30fps. I can capture (v4l2src ! fakesink num-buffers=600) and
it runs for 10 seconds as expected. However if I capture, decode, and
split the video into individual frames, looking at the motion of a sweep-second
hand on a clock I can see that there are only 30 frames per second.

Is there a way I can see where the frames are being dropped?

The full gstreamer pipeline is basically…

v4l2src device=/dev/video5 do-timestamp=true 
! capsfilter caps=video/x-raw,width=1920,height=1080,framerate=60/1 
! nvvidconv ! capsfilter caps=video/x-raw(memory:NVMM)
! omxh264enc insert-sps-pps=true control-rate=1  profile=high
    bitrate=6000000 iframeinterval=120
! capsfilter caps=video/x-h264,stream-format=(string)byte-stream
! queue ! h264parse config-interval=5  
! queue ! mpegtsmux name=mux pat-interval=10000 pmt-interval=10000 alignment=7  
!  udpsink ttl=16 ttl-mc=8 ttl-mc=8 send-duplicates=false 
   auto-multicast=false   sync=false

Please break down the pipeline to check the framerate:

$ gst-launch-1.0 v4l2src device=/dev/video1 ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! fpsdisplaysink video-sink=fakesink -v
$ gst-launch-1.0 v4l2src device=/dev/video1 ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! fpsdisplaysink video-sink=fakesink -v
$ gst-launch-1.0 v4l2src device=/dev/video1 ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! omxh264enc ! fpsdisplaysink video-sink=fakesink -v

If no other encode streams are running …

Pipeline 1: rendered: 358, dropped: 0, current: 58.11, average: 58.56
Pipeline 2: rendered: 429, dropped: 0, current: 60.00, average: 60.01
Pipeline 3: rendered: 576, dropped: 0, current: 59.96, average: 60.31

If two other streams (1080p60 h264 transport stream tx) are running …

Pipeline 1: rendered: 608, dropped: 0, current: 55.98, average: 56.84
Pipeline 2: rendered: 312, dropped: 0, current: 40.05, average: 40.03
Pipeline 3: rendered: 382, dropped: 0, current: 40.34, average: 41.32

This is consistent with the video quality – it is smoother if only
one stream is running.

Our requirement is for 3x 1080p60 (or 4x 720p60) streams.

Any insight would be appreciated.

Thanks in Advance,

Cary

Additional Observations:

It seems the frame rate in the third test (with the h264 encoder) starts
out at 60 fps but then drops over time to a lower rate. It also seems
this rate depends on the resolution of other video streams.

With 3x 1080p-60, it takes about 5 seconds to drop down to 46 fps, and
continues to drop slowly (over several minutes) down to about 30 fps.

If I stop the other two streams, the frame rate seems to very slowly
increase.

With 1x 1080p-60 (measured) and 2x 720p-60, it will take a minute or
two to drop from the initial 60 fps down to 48fps (average frame rate).
This frame rate too seems to continue to drop, slowly, over time.

REVISION:

The above observations are for the average frame rate, which seems to
be calculated over a large number of (all?) frames. Looking at the
instantaneous frame rate, the change is much faster.

1 X 1080p-60 streams 60 current fps on the measured pipeline
2 X 1080p-60 streams 40 current fps on the measured pipeline
3 X 1080p-60 streams 30 current fps on the measured pipeline

Hi,
Did you rebuild libgstomx.so to run at max clocks as suggested at
https://devtalk.nvidia.com/default/topic/1046826/jetson-tx2/encode-bogging-down-when-encoding-multiple-streams-of-different-resolutions-/post/5312422/#5312422

I was able to download and rebuild the code in gstomx1_src/gst-omx1 on
the target machine.

I downloaded

https://developer.download.nvidia.com/embedded/L4T/r28_Release_v1.0/BSP/source_release.tbz2

The unmodified library operated correctly.

I tried to apply the patch, but it didn’t apply since the line

oEncodeProp.bInsertVUI = self->insert_vui;

wasn’t present, so I added the line setting bSetMaxEncClock = TRUE by hand.

if (TRUE) {
    GST_OMX_INIT_STRUCT (&oEncodeProp);
    oEncodeProp.nPortIndex = enc->enc_out_port->index;

    eError = gst_omx_component_get_index (GST_OMX_VIDEO_ENC (self)->enc,
        (gpointer) NVX_INDEX_PARAM_VIDEO_ENCODE_PROPERTY, &eIndex);

    if (eError == OMX_ErrorNone) {
      eError =
          gst_omx_component_get_parameter (GST_OMX_VIDEO_ENC (self)->enc,
          eIndex, &oEncodeProp);
      if (eError == OMX_ErrorNone) {
        oEncodeProp.bInsertSPSPPSAtIDR = self->insert_sps_pps;
        oEncodeProp.bInsertAUD = self->insert_aud;
        oEncodeProp.bSetMaxEncClock = TRUE;

        eError =
            gst_omx_component_set_parameter (GST_OMX_VIDEO_ENC (self)->enc,
            eIndex, &oEncodeProp);
      }
    }

Running with this library, I am still only seeing about 40 fps encoded when three
1080p60 streams are running.

rendered: 981, dropped: 0, current: 39.98, average: 40.58
rendered: 1001, dropped: 0, current: 39.74, average: 40.57
rendered: 1021, dropped: 0, current: 39.92, average: 40.55
rendered: 1042, dropped: 0, current: 40.27, average: 40.55
rendered: 1063, dropped: 0, current: 40.12, average: 40.54
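
When comparing runs, it can help to reduce a saved fpsdisplaysink log to one number. The following is a minimal sketch, assuming the log lines keep the `current: NN.NN` field format shown above; the function name `mean_current_fps` is made up for illustration.

```shell
# Print the mean of the "current:" values found on stdin (two decimals).
# Lines without a "current: <number>" field are ignored.
mean_current_fps() {
    awk '
        match($0, /current: [0-9.]+/) {
            # RSTART/RLENGTH delimit "current: NN.NN"; skip the 9-character label
            sum += substr($0, RSTART + 9, RLENGTH - 9)
            n++
        }
        END { if (n) printf "%.2f\n", sum / n }
    '
}
```

For example, `mean_current_fps < fps.log`, where `fps.log` is a hypothetical file holding the output of `gst-launch-1.0 ... -v`.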

I verified that the patched libgstomx.so was being
used with lsof of the running process and md5sum.

Any other suggestions?

Thanks, Cary

Any suggestions on what steps to try next?

Thanks

Hi,
Please run nvpmodel and jetson_clocks.sh:
https://devtalk.nvidia.com/default/topic/1030506/jetson-tx2/nvpmodel-and-jetson_clocks/post/5242231/#5242231

And check tegrastats to ensure the encoder clock is always at max.

I attempted to increase the gpu clock to the maximum.

Frame rate before change (running 3x 1080p60 encode streams).

rendered: 271, dropped: 0, current: 39.92, average: 40.58

nvpmodel output

# nvpmodel -q --verbose
NVPM VERB: parsing done for /etc/nvpmodel.conf
NVPM VERB: Current mode: NV Power Mode: MAXP_CORE_ARM
3
NVPM VERB: PARAM CPU_ONLINE: ARG CORE_1: PATH /sys/devices/system/cpu/cpu1/online: REAL_VAL: 1 CONF_VAL: 0
NVPM VERB: PARAM CPU_ONLINE: ARG CORE_2: PATH /sys/devices/system/cpu/cpu2/online: REAL_VAL: 1 CONF_VAL: 0
NVPM VERB: PARAM CPU_ONLINE: ARG CORE_3: PATH /sys/devices/system/cpu/cpu3/online: REAL_VAL: 1 CONF_VAL: 1
NVPM VERB: PARAM CPU_ONLINE: ARG CORE_4: PATH /sys/devices/system/cpu/cpu4/online: REAL_VAL: 1 CONF_VAL: 1
NVPM VERB: PARAM CPU_ONLINE: ARG CORE_5: PATH /sys/devices/system/cpu/cpu5/online: REAL_VAL: 1 CONF_VAL: 1
NVPM VERB: PARAM CPU_A57: ARG MIN_FREQ: PATH /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: REAL_VAL: 2035200 CONF_VAL: 0
NVPM VERB: PARAM CPU_A57: ARG MAX_FREQ: PATH /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: REAL_VAL: 2035200 CONF_VAL: 2000000
NVPM VERB: PARAM GPU: ARG MIN_FREQ: PATH /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/min_freq: REAL_VAL: 114750000 CONF_VAL: 0
NVPM VERB: PARAM GPU: ARG MAX_FREQ: PATH /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/max_freq: REAL_VAL: 1134750000 CONF_VAL: 1120000000
NVPM VERB: PARAM EMC: ARG MAX_FREQ: PATH /sys/kernel/nvpmodel_emc_cap/emc_iso_cap: REAL_VAL: 1600000000 CONF_VAL: 1600000000

Attempt to increase gpu frequency.

# ls /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/
available_frequencies  available_governors  cur_freq  device  governor  max_freq  min_freq  polling_interval  power  subsystem  target_freq  trans_stat  uevent
# ls /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/available_frequencies 
/sys/devices/17000000.gp10b/devfreq/17000000.gp10b/available_frequencies
# cat /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/available_frequencies
114750000 216750000 318750000 420750000 522750000 624750000 726750000 828750000 930750000 1032750000 1134750000 1236750000 1300500000
# cat /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/max_freq
1134750000
# echo 1300500000 > /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/max_freq
# cat /sys/devices/17000000.gp10b/devfreq/17000000.gp10b/max_freq
1300500000
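
The manual steps above (read available_frequencies, pick the largest, write it to max_freq) can be sketched as a small helper. This is only an illustration: `highest_freq` is a made-up name, and it assumes the file holds a space-separated list of integers as shown above. Per the results in this thread, raising the GPU clock did not change the encode frame rate.

```shell
# Read a space-separated list of frequencies on stdin and print the largest.
highest_freq() {
    tr ' ' '\n' | sort -n | tail -n 1
}

# Possible use as root, with the sysfs directory taken from the listing above:
#   GPU=/sys/devices/17000000.gp10b/devfreq/17000000.gp10b
#   highest_freq < "$GPU/available_frequencies" > "$GPU/max_freq"
```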

Re-test frame rate.

rendered: 527, dropped: 0, current: 39.99, average: 48.75
rendered: 548, dropped: 0, current: 39.98, average: 48.35
rendered: 569, dropped: 0, current: 40.02, average: 47.98
rendered: 590, dropped: 0, current: 39.98, average: 47.64

Increasing this speed didn’t seem to help the frame rate.

Are there any other settings I need to look at?

Thanks in Advance,

Cary

Hi,
Please share tegrastats for reference.

Hi,
I guess you missed the modification:

@@ -392,9 +393,7 @@ gst_omx_h264_enc_set_format (GstOMXVideoEnc * enc, GstOMXPort * port,
   }
 
 
-  if (self->insert_sps_pps ||
-      self->insert_aud ||
-      self->insert_vui) {
+  if (TRUE) {
     err = gst_omx_h264_enc_set_params (enc);
     if (err != OMX_ErrorNone) {
       GST_WARNING_OBJECT (self,

If you apply the patch correctly, you will see MSENC always staying at 1113 (MHz) in tegrastats:

RAM 1022/7846MB (lfb 1545x4MB) CPU [8%@2032,off,off,100%@2032,15%@2032,17%@2032] EMC_FREQ 9%@1600 GR3D_FREQ 0%@1122 MSENC 1113 APE 150 BCPU@35C MCPU@35C GPU@32C PLL@35C Tboard@28C Tdiode@30C PMIC@100C thermal@34C VDD_IN 6087/6122 VDD_CPU 1377/1377 VDD_GPU 153/159 VDD_SOC 1836/1856 VDD_WIFI 0/1 VDD_DDR 1155/1158

We verified it again by launching the pipeline below in three consoles simultaneously.

$ gst-launch-1.0 videotestsrc is-live=1 ! video/x-raw,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12' ! omxh264enc ! fpsdisplaysink video-sink=fakesink -v

Tegra stats output…

# ./tegrastats 
RAM 558/7851MB (lfb 1740x4MB) cpu [0%@2034,0%@2034,0%@2034,0%@2034,0%@2036,0%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [24%@2034,0%@2034,0%@2034,51%@2031,46%@2034,25%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [24%@2036,0%@2035,0%@2034,46%@2036,48%@2035,23%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [43%@2036,0%@2035,0%@2036,52%@2034,47%@2035,4%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [14%@2032,0%@2035,0%@2036,52%@2033,47%@2034,34%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [23%@2034,0%@2035,0%@2035,49%@2036,50%@2033,22%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [41%@2035,0%@2035,0%@2034,49%@2035,46%@2036,10%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [17%@2034,0%@2034,0%@2034,50%@2035,47%@2033,31%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [12%@2036,0%@2034,0%@2035,49%@2034,46%@2035,34%@2035] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [36%@1993,0%@2035,0%@2035,48%@1997,49%@1994,14%@1997] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [31%@2034,0%@2034,0%@2035,50%@2034,46%@2035,13%@2035] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [33%@2034,0%@2034,0%@2035,49%@2035,48%@2035,16%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 558/7851MB (lfb 1740x4MB) cpu [43%@2035,0%@2034,0%@2034,48%@2035,45%@2035,4%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [9%@2033,0%@2035,0%@2035,50%@2036,47%@2035,39%@2035] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [5%@2032,13%@2035,0%@2035,51%@2032,47%@2033,33%@2034] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2035,95%@2035,0%@2034,48%@2034,48%@2035,6%@2036] EMC 33%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [4%@2033,94%@2035,0%@2036,48%@2034,46%@2034,3%@2036] EMC 32%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [4%@2033,94%@2035,0%@2034,48%@2034,49%@2034,5%@2034] EMC 32%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2032,95%@2036,0%@2035,51%@2036,46%@2035,1%@2036] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2033,94%@2035,0%@2035,51%@2035,46%@2033,5%@2034] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2035,52%@2035,42%@2036,51%@2034,49%@2034,6%@2034] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [8%@2036,0%@2035,94%@2034,55%@2036,50%@2036,6%@2034] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [7%@2033,0%@2034,95%@2036,53%@2031,50%@2033,9%@2033] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2034,0%@2034,95%@2034,54%@2034,50%@2034,3%@2034] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [6%@2034,0%@2035,94%@2034,54%@2035,50%@2035,3%@2033] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 559/7851MB (lfb 1740x4MB) cpu [5%@2034,0%@2035,91%@2034,50%@2035,49%@2034,2%@2034] EMC 31%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 538/7851MB (lfb 1742x4MB) cpu [2%@2034,0%@2033,0%@2036,45%@2035,48%@2036,3%@2034] EMC 27%@1600 APE 150 MSENC 1113 GR3D 0%@114
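
A long tegrastats capture like the one above is tedious to scan by eye. As a rough sketch, assuming the `MSENC <MHz>` field format seen in this thread, the following counts samples where the encoder clock is reported below an expected value. The function name `msenc_below` is hypothetical, and samples without an MSENC field are skipped.

```shell
# Count tegrastats samples (stdin) whose MSENC clock is below $1 MHz.
# 1113 is the value quoted in this thread as the maximum.
msenc_below() {
    awk -v want="$1" '
        match($0, /MSENC [0-9]+/) {
            # Skip the 6-character "MSENC " label and force a numeric value
            mhz = substr($0, RSTART + 6, RLENGTH - 6) + 0
            if (mhz < want) low++
            seen++
        }
        END { printf "%d of %d samples below %d MHz\n", low, seen, want }
    '
}
```

For example, `msenc_below 1113 < tegrastats.log`, where `tegrastats.log` is a hypothetical saved capture.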

GStreamer pipeline with fpsdisplay sink (edited)

/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1833, dropped: 0, current: 59.37, average: 60.04
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1864, dropped: 0, current: 60.43, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1895, dropped: 0, current: 60.21, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1925, dropped: 0, current: 59.90, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1955, dropped: 0, current: 60.00, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 1986, dropped: 0, current: 60.10, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2016, dropped: 0, current: 59.90, average: 60.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2037, dropped: 0, current: 40.55, average: 59.75
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2057, dropped: 0, current: 40.00, average: 59.46
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2078, dropped: 0, current: 40.02, average: 59.17
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2099, dropped: 0, current: 39.99, average: 58.89
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2119, dropped: 0, current: 40.00, average: 58.63
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2139, dropped: 0, current: 39.74, average: 58.37
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 2160, dropped: 0, current: 40.28, average: 58.12
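
The log above also illustrates why the average column reacts so slowly: it appears to be total frames divided by total elapsed time. A small worked check, under that assumption (the helper name `cumulative_avg` is made up for illustration), takes the last 60 fps sample (2016 frames at an average of 60.05) and assumes a steady 40 fps afterwards up to frame 2160:

```shell
# Cumulative average after a rate change: total frames / total elapsed time.
# Arguments: frames_before avg_before frames_after rate_after
cumulative_avg() {
    awk -v f1="$1" -v r1="$2" -v f2="$3" -v r2="$4" 'BEGIN {
        t1 = f1 / r1            # seconds spent at the old rate
        t2 = (f2 - f1) / r2     # seconds spent at the new rate
        printf "%.2f\n", f2 / (t1 + t2)
    }'
}

cumulative_avg 2016 60.05 2160 40
```

The result is about 58.1, close to the 58.12 reported on the last line, so the current column is the better indicator of what the pipeline is doing right now.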

I believe the changes referenced are in my version of gstomxh264enc.c

*** gstomxh264enc.c.orig	2017-07-20 07:45:28.000000000 +0000
--- gstomxh264enc.c	2019-02-27 14:30:17.939562760 +0000
***************
*** 203,211 ****
    OMX_ERRORTYPE eError = OMX_ErrorNone;
    NVX_PARAM_VIDENCPROPERTY oEncodeProp;
    GstOMXH264Enc *self = GST_OMX_H264_ENC (enc);
  
!   if (self->insert_sps_pps) {
      GST_OMX_INIT_STRUCT (&oEncodeProp);
      oEncodeProp.nPortIndex = enc->enc_out_port->index;
  
      eError = gst_omx_component_get_index (GST_OMX_VIDEO_ENC (self)->enc,
--- 203,211 ----
    OMX_ERRORTYPE eError = OMX_ErrorNone;
    NVX_PARAM_VIDENCPROPERTY oEncodeProp;
    GstOMXH264Enc *self = GST_OMX_H264_ENC (enc);
  
!   if (TRUE) {
      GST_OMX_INIT_STRUCT (&oEncodeProp);
      oEncodeProp.nPortIndex = enc->enc_out_port->index;
  
      eError = gst_omx_component_get_index (GST_OMX_VIDEO_ENC (self)->enc,
***************
*** 217,224 ****
--- 217,225 ----
            eIndex, &oEncodeProp);
        if (eError == OMX_ErrorNone) {
          oEncodeProp.bInsertSPSPPSAtIDR = self->insert_sps_pps;
          oEncodeProp.bInsertAUD = self->insert_aud;
+ 	oEncodeProp.bSetMaxEncClock = TRUE;
  
          eError =
              gst_omx_component_set_parameter (GST_OMX_VIDEO_ENC (self)->enc,
              eIndex, &oEncodeProp);
***************
*** 380,388 ****
      }
    }
  
  
!   if (self->insert_sps_pps || self->insert_aud) {
      err = gst_omx_h264_enc_set_params (enc);
      if (err != OMX_ErrorNone) {
        GST_WARNING_OBJECT (self,
            "Error setting encode property: %s (0x%08x)",
--- 381,389 ----
      }
    }
  
  
!   if ( TRUE ) {
      err = gst_omx_h264_enc_set_params (enc);
      if (err != OMX_ErrorNone) {
        GST_WARNING_OBJECT (self,
            "Error setting encode property: %s (0x%08x)",

Note that my original file did not match the original in the patch. I’m not sure
which versions of gstomx1_src.tbz2 are available where.

I did try to verify that this modified version was running.

# in ~/Build/sources/gstomx1_src/gst-omx1/omx
 md5sum $(find . -name '*.so')
ca31c2f1860e5969fd81c44dcf2c0adb  ./.libs/libgstomx.so

# on the test system 
# ps auwwx | grep gst-launch
root      1943 56.4  0.2 632980 20380 pts/1    Sl+  20:13   0:05 gst-launch-1.0 v4l2src device=/dev/video5 ! capsfilter caps=video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! nvvidconv ! capsfilter caps=video/x-raw(memory:NVMM),format=NV12 ! omxh264enc ! fpsdisplaysink video-sink=fakesink -v
# lsof -p 1943 | grep omx
gst-launc 1943 root  mem       REG              179,1   405064 1705851 /usr/lib/aarch64-linux-gnu/tegra/libnvomx.so
gst-launc 1943 root  mem       REG              179,1  1559720 1450474 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstomx.so
# md5sum /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstomx.so
ca31c2f1860e5969fd81c44dcf2c0adb  /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstomx.so
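
The comparison above can be wrapped in a small helper so the check is repeatable after each rebuild. This is a sketch only; `same_md5` is a hypothetical name, and it relies on the same `md5sum` tool used above.

```shell
# Succeed (exit 0) if both files have the same MD5 digest,
# fail with 1 if they differ, or with 2 if either file cannot be read.
same_md5() {
    a=$(md5sum < "$1") || return 2
    b=$(md5sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Possible use, with the paths taken from the session above:
#   same_md5 ./.libs/libgstomx.so \
#       /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstomx.so && echo "patched library installed"
```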

Note that your test pipeline:

$ gst-launch-1.0 videotestsrc is-live=1 ! video/x-raw,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12' ! omxh264enc ! fpsdisplaysink video-sink=fakesink -v

does run at 60 fps with other encode streams running. This may be due to a
difference in the complexity of the data.

Let me know what else I can check.

Cary

Let me know if there is anything else you would like me to test.

Hi,
If you run

$ gst-launch-1.0 v4l2src device=/dev/video1 ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! fpsdisplaysink video-sink=fakesink sync=false text-overlay=false -v

And can see 3 sources outputting at 60fps steadily.

But the framerate drops when running

$ gst-launch-1.0 v4l2src device=/dev/video1 ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! omxh264enc ! fpsdisplaysink video-sink=fakesink sync=false text-overlay=false -v

And MSENC stays at the max frequency of 1113 MHz. It is probably a performance limitation of the TX2.

BTW, please run ‘sudo ./tegrastats’ to print out all information.

I was running those tests, and getting 40 fps. I found that if I put a queue
element before the nvvidconv element the frame rate would go up to 60 fps.
I had the problem with or without the omxh264enc element.

As I was running the tests the following error occurred:

[  901.079235] video4linux video6: vi4_channel_start_streaming -> tegra_channel_capture_setup port 0
[  901.090702] tegra-vi4 15700000.vi: tegra_channel_capture_setup++ h 1080 w 1920 data_type 30 virtual_channel 1
[  901.117568] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3e000000, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.133126] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3e3d8480, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.148687] (255) csw_viw: MC request violates VPR requirements
[  901.155490]   status = 0x00377072; addr = 0x3ffffffc0
[  901.161408]   secure: yes, access-type: write
[  901.166600] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[  901.178051] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[  901.189491] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[  901.200928] mc-err: Too many MC errors; throttling prints
[  901.217566] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3e400000, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.233023] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3e7d1e00, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.250894] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3e800000, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.266340] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3ebd13c0, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.284228] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3ec00000, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  901.299714] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x3efd3100, fsynr=0x80011, cb=19, sid=4(0x4 - VI), pgd=0, pud=0, pmd=0, pte=0
[  907.251475] host1x 13e10000.host1x: v4l2src0:src: syncpoint id 18 (15340000.vic_gst-worker_0) stuck waiting 2159, timeout=-1
[  907.265203] ---- syncpts ----
[  907.269412] id 18 (15340000.vic_gst-worker_0) min 2157 max 2160 refs 3 (previous client : 15340000.vic_gst-worker_0)
[  907.282892] id 19 (154c0000.nvenc_v4l2src0:src_0) min 1438 max 1438 refs 1 (previous client : 15340000.vic_kworker/5:2_0)
[  907.296413] id 20 (tegra-vi4) min 728 max 728 refs 1 (previous client : 154c0000.nvenc_kworker/5:2_0)
[  907.299477] host1x 13e10000.host1x: v4l2src0:src: syncpoint id 23 (15340000.vic_gst-worker_0) stuck waiting 5, timeout=-1
[  907.299478] ---- syncpts ----
[  907.326025] id 21 (tegra-vi4) min 726 max 726 refs 1 (previous client : 154c0000.nvenc_v4l2src0:src_0)
[  907.337848] id 22 (15340000.vic_gst-worker_0) min 2155 max 2155 refs 1 (previous client : tegra-vi4)
[  907.349512] id 23 (15340000.vic_gst-worker_0) min 3 max 6 refs 3 (previous client : 15340000.vic_gst-worker_0)
[  907.362049] id 25 (tegra-vi4) min 8 max 8 refs 1 (previous client : tegra-vi4)
[  907.371790] id 28 (tegra-vi4) min 8 max 8 refs 1 (previous client : tegra-vi4)
[  907.382050] 
[  907.382077] id 18 (15340000.vic_gst-worker_0) min 2157 max 2160 refs 3 (previous client : 15340000.vic_gst-worker_0)
[  907.382080] id 19 (154c0000.nvenc_v4l2src0:src_0) min 1438 max 1438 refs 1 (previous client : 15340000.vic_kworker/5:2_0)
[  907.382084] id 20 (tegra-vi4) min 728 max 728 refs 1 (previous client : 154c0000.nvenc_kworker/5:2_0)
[  907.382086] id 21 (tegra-vi4) min 726 max 726 refs 1 (previous client : 154c0000.nvenc_v4l2src0:src_0)
[  907.382089] id 22 (15340000.vic_gst-worker_0) min 2155 max 2155 refs 1 (previous client : tegra-vi4)
[  907.382092] id 23 (15340000.vic_gst-worker_0) min 3 max 6 refs 3 (previous client : 15340000.vic_gst-worker_0)
[  907.382095] id 25 (tegra-vi4) min 8 max 8 refs 1 (previous client : tegra-vi4)
[  907.382100] id 28 (tegra-vi4) min 8 max 8 refs 1 (previous client : tegra-vi4)
[  907.382631]

(gst-worker is our GStreamer application, we run one per capture process).

After this I had to reset (actually power cycle) the unit to recover. After this the pipeline including the omxh264enc element ran at 60 fps. I could run 3x in parallel at 60 fps.

There is apparently some error condition that occurs that degrades performance, and eventually leads to the kernel error message above.

I am working on isolating the exact sequence of events that puts the unit into this state.

Thanks for your help, I will update you with anything relevant that I discover.

Any updates on this matter, cobrien? We’re also struggling with a half-frame-rate problem… The streaming fps is always 25 whereas our camera is 50 fps.

Adding the ‘queue’ element seemed to help, since this creates a new thread, and on a
multi-core device like the TX2 this could mean more CPUs could be kept busy. top -H
will break down utilization by thread. I played around with priority and thread
affinity and it didn’t seem to help.
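
For reference, the queue placement described here could look like the following. It is a sketch based on the test pipelines earlier in the thread, with a `queue` added between the capture caps and `nvvidconv`. The helper only prints the command so it can be reviewed before running; the name `queued_pipeline` and the device path in the example call are illustrative.

```shell
# Print (do not run) a capture/encode test pipeline with a queue in front
# of nvvidconv, so capture and conversion/encode run in separate threads.
# $1 = V4L2 device node.
queued_pipeline() {
    dev="$1"
    printf '%s\n' "gst-launch-1.0 v4l2src device=$dev ! video/x-raw,width=1920,height=1080,format=UYVY,framerate=60/1 ! queue ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! omxh264enc ! fpsdisplaysink video-sink=fakesink sync=false text-overlay=false -v"
}

queued_pipeline /dev/video1
```

While the printed pipeline is running, `top -H` in another console shows whether the extra thread is actually spreading the load across cores.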

Note also that we had one HDMI camera that purported to be 60 fps but was only able
to capture at 30 fps: each frame was duplicated once. Other sources (a GoPro camera)
provided a true 60 fps video stream. I had to break the stream into individual JPEG
images and compare them to see this.
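
A rough way to automate that comparison is sketched below. It assumes the dumped frames are byte-identical when duplicated (true only for lossless dumps of the raw capture; frames that went through the H.264 encoder would need a pixel-level comparison instead) and that file names sort in capture order, e.g. zero-padded numbers. The function name `count_repeated_frames` is made up.

```shell
# Count frames in directory $1 that are byte-identical to the previous frame.
# Files are visited in glob (lexicographic) order.
count_repeated_frames() {
    prev=""
    repeats=0
    for f in "$1"/*; do
        sum=$(md5sum < "$f")
        if [ "$sum" = "$prev" ]; then
            repeats=$((repeats + 1))
        fi
        prev=$sum
    done
    echo "$repeats"
}
```

A source that really delivers 30 fps inside a 60 fps stream would show roughly half of the frames as repeats.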

I believe setting the clocks faster helped too.

I’m going to mark this as the answer, since this is what we did.