Video Glitches When Encoding Custom appsrc in GStreamer

Hi there, I hope you will be able to assist me with a GStreamer issue that I’ve been encountering on my TX2 NX board.

I’ve been developing an application to encode 100fps video from a Basler industrial camera to a file. I’m using GStreamer’s appsrc to push image buffers into a pipeline that uses nvv4l2h264enc to encode the video and write it to an SD card.

This is working and the video has the right quality and framerate, but I am finding that it will occasionally glitch for several seconds, jumping forwards and backwards through frames and exhibiting some frame tearing. I haven’t been able to diagnose the cause because the problem is inconsistent: some videos show no issue for minutes, while others are glitchy almost from the start. The issue also occurs if I write to the Jetson’s internal memory instead of the SD card.

I have noticed that the CPU usage creeps upwards when encoding and settles around 90%. I have also connected the “need-data” signal to my pipeline’s appsrc, and I have noticed that it stops signalling at the same time as the glitches start occurring; however, the “enough-data” signal never triggers. I am quite a novice with GStreamer, so I’m not sure how to determine whether the pipeline is running too slowly or how I can optimise it.

This is my code that handles sending frame data to the pipeline:

void VideoEncoder::PushFrame(Pylon::CGrabResultPtr ptrGrabResult) {
    Pylon::CPylonImage pylonImage;

    pylonImage.AttachGrabResultBuffer(ptrGrabResult);

    GstBuffer* buffer = gst_buffer_new_wrapped_full(
      GST_MEMORY_FLAG_PHYSICALLY_CONTIGUOUS,
      pylonImage.GetBuffer(),
      pylonImage.GetImageSize(),
      0,
      pylonImage.GetImageSize(),
      NULL,
      NULL
    );

    gst_buffer_add_video_meta(
      buffer,
      GST_VIDEO_FRAME_FLAG_NONE,
      GST_VIDEO_FORMAT_GRAY8,
      pylonImage.GetWidth(),
      pylonImage.GetHeight()
    );

    GstFlowReturn ret = GST_FLOW_OK;

    if (web_app_source != NULL) { // Streaming to web
      g_signal_emit_by_name(web_app_source, "push-buffer", buffer, &ret);
    } else {
      g_signal_emit_by_name(app_source, "push-buffer", buffer, &ret);
    }
    if (ret != GST_FLOW_OK) {
      // PLOG_ERROR << "Error: " << ret;
    }
    pylonImage.Release();
    // The "push-buffer" action signal does not take ownership of the buffer,
    // so it is unreffed exactly once, on both the success and error paths.
    gst_buffer_unref(buffer);
}
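
A side note on the PushFrame code above: gst_buffer_new_wrapped_full() wraps the camera memory without copying it, and its last two arguments (user_data and notify) are NULL, so nothing tells the application when GStreamer has finished reading that memory. If the camera SDK reuses the memory after pylonImage.Release(), a downstream element may still be reading it. Whether that contributes to the glitches described here is not established in this thread. The following is a minimal sketch of the usual keep-alive pattern, written against the standard library only so it can be compiled on its own. Frame, FrameKeepAlive, MakeKeepAlive and ReleaseFrame are hypothetical names; in the real code the holder would presumably keep a copy of the Pylon::CGrabResultPtr instead of a std::shared_ptr (an assumption about the Pylon SDK, not something confirmed here).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a camera frame; in the code above this role
// would be played by the grab result that owns the image memory.
using Frame = std::vector<std::uint8_t>;

// Heap-allocated holder whose only job is to keep one reference to the frame.
struct FrameKeepAlive {
    std::shared_ptr<Frame> frame;
};

// Creates the holder that would be passed as the `user_data` argument.
void* MakeKeepAlive(std::shared_ptr<Frame> frame) {
    assert(frame != nullptr);
    return new FrameKeepAlive{std::move(frame)};
}

// Same shape as a GDestroyNotify callback: void (*)(gpointer), where
// gpointer is void*. Deleting the holder drops its reference to the frame.
void ReleaseFrame(void* user_data) {
    delete static_cast<FrameKeepAlive*>(user_data);
}
```

With this pattern, the result of MakeKeepAlive(frame) would go in place of the first NULL and ReleaseFrame in place of the second NULL in the gst_buffer_new_wrapped_full() call, so the frame is only released once the wrapped memory is freed by the pipeline.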

My GStreamer pipeline looks like this (I’m using gst_parse_launch to create it):
appsrc name=appsrc ! nvvidconv ! nvv4l2h264enc name=nvv4l2h264enc maxperf-enable=1 bitrate=128000000 ! h264parse ! matroskamux ! filesink name=filesink location=/media/sd/video_record.mp4

Any help or ideas as to how I can diagnose and improve the performance would be greatly appreciated as this is the last major hurdle I am experiencing!

Kind regards,
Adam

Hi,
Here are some suggestions for further debugging:

  1. Run VIC at maximum clock:
    Nvvideoconvert issue, nvvideoconvert in DS4 is better than Ds5? - #3 by DaneLLL
  2. Run sudo jetson_clocks to run the CPU cores at maximum clock
  3. Run sudo tegrastats to check the system status. It may be hitting throttling or over-current in this condition
  4. Try with fakesink sync=0 to check whether the issue occurs without writing to a file:
    ... ! matroskamux ! fakesink sync=0

Hi Dane,

Thank you for your assistance, I set the clocks as you recommended but the issue appeared unaffected.

When using fakesink it looks like the issue gets a lot better with the need-data signal being sent out constantly, although I have noticed some lag in it where it can pause for almost 100ms.

Should I expect to be able to write a video file without encountering throttling like this? With the bitrate I have, my video is about 16MB/s. I do get a video of the correct size after n seconds of recording, but with the aforementioned glitches, which made me think that writing to the file isn’t the issue; the fakesink result may indicate otherwise!

This is a portion of what tegrastats logs when the video issues occur; it looks like the CPU load decreases at this time but I can’t make much of what it means.

RAM 1842/3834MB (lfb 63x4MB) SWAP 0/1917MB (cached 0MB) CPU [48%@2035,0%@345,0%@345,20%@2035,32%@2035,32%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@81C MCPU@81C PMIC@50C GPU@77C BCPU@81C thermal@79.9C
RAM 1845/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [36%@2035,0%@345,0%@345,29%@2035,36%@2035,38%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@81.5C MCPU@81.5C PMIC@50C GPU@77.5C BCPU@81.5C thermal@79.9C
RAM 1863/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [57%@2035,0%@345,0%@345,55%@2035,72%@2035,64%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@83.5C MCPU@83.5C PMIC@50C GPU@78C BCPU@83.5C thermal@80.3C
RAM 1867/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [82%@2035,0%@345,0%@345,70%@2035,83%@2035,82%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@85C MCPU@85C PMIC@50C GPU@78.5C BCPU@85C thermal@82.1C
RAM 1867/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [65%@2035,0%@345,0%@345,61%@2035,62%@2035,67%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@85C MCPU@85C PMIC@50C GPU@79.5C BCPU@85C thermal@82.3C
RAM 1874/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [66%@2035,0%@345,0%@345,70%@2035,63%@2035,69%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@85.5C MCPU@85.5C PMIC@50C GPU@80C BCPU@85.5C thermal@83.1C
RAM 1883/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [69%@2035,0%@345,0%@345,66%@2035,76%@2035,66%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@86C MCPU@86C PMIC@50C GPU@80C BCPU@86C thermal@84.1C
RAM 1889/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [63%@2035,0%@345,0%@345,85%@2035,66%@2035,61%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@86.5C MCPU@86.5C PMIC@50C GPU@80.5C BCPU@86.5C thermal@84.1C
RAM 1898/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [71%@2035,0%@345,0%@345,66%@2035,72%@2035,81%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@87C MCPU@87C PMIC@50C GPU@81C BCPU@87C thermal@84.1C
RAM 1911/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [72%@2035,0%@345,0%@345,66%@2035,71%@2035,78%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@87.5C MCPU@87.5C PMIC@50C GPU@81.5C BCPU@87.5C thermal@84.6C
RAM 1915/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [67%@2035,0%@345,0%@345,75%@2035,61%@2035,76%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@87.5C MCPU@87.5C PMIC@50C GPU@82C BCPU@87.5C thermal@85.1C
RAM 1929/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [68%@2035,0%@345,0%@345,71%@2035,67%@2035,73%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@88C MCPU@88C PMIC@50C GPU@82C BCPU@88C thermal@85.6C
RAM 1926/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [84%@2035,0%@2035,0%@2035,83%@2035,71%@2035,72%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@88.5C MCPU@88.5C PMIC@50C GPU@82.5C BCPU@88.5C thermal@86.4C
RAM 1934/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [70%@2035,0%@345,0%@345,87%@2035,68%@2035,71%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@88.5C MCPU@88.5C PMIC@50C GPU@82.5C BCPU@88.5C thermal@86.1C
RAM 1938/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [65%@2035,0%@345,0%@345,69%@2035,77%@2035,71%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@89C MCPU@89C PMIC@50C GPU@83C BCPU@89C thermal@86.6C
RAM 1940/3834MB (lfb 62x4MB) SWAP 0/1917MB (cached 0MB) CPU [65%@2035,0%@345,0%@345,56%@2035,61%@2035,63%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@87.5C MCPU@87.5C PMIC@50C GPU@83C BCPU@87.5C thermal@86.9C
RAM 1909/3834MB (lfb 61x4MB) SWAP 0/1917MB (cached 0MB) CPU [56%@2035,0%@345,0%@345,48%@2035,64%@2035,56%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@88.5C MCPU@88.5C PMIC@50C GPU@83C BCPU@88.5C thermal@86.6C
RAM 1908/3834MB (lfb 61x4MB) SWAP 0/1917MB (cached 0MB) CPU [51%@2035,0%@345,0%@345,38%@2035,55%@2035,47%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@88.5C MCPU@88.5C PMIC@50C GPU@83.5C BCPU@88.5C thermal@86.6C
RAM 1910/3834MB (lfb 60x4MB) SWAP 0/1917MB (cached 0MB) CPU [73%@2035,0%@345,0%@345,72%@2035,64%@2035,70%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@89.5C MCPU@89.5C PMIC@50C GPU@83.5C BCPU@89.5C thermal@86.5C
RAM 1911/3834MB (lfb 57x4MB) SWAP 0/1917MB (cached 0MB) CPU [65%@2035,0%@345,0%@345,66%@2035,64%@2035,66%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@90C MCPU@90C PMIC@50C GPU@83.5C BCPU@90C thermal@86.9C
RAM 1913/3834MB (lfb 53x4MB) SWAP 0/1917MB (cached 0MB) CPU [63%@2035,0%@345,0%@345,64%@2035,65%@2035,68%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@90C MCPU@90C PMIC@50C GPU@84C BCPU@90C thermal@87.6C
RAM 1914/3834MB (lfb 48x4MB) SWAP 0/1917MB (cached 0MB) CPU [68%@2035,0%@345,0%@345,62%@2035,61%@2035,70%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@90.5C MCPU@90.5C PMIC@50C GPU@84C BCPU@90.5C thermal@87.6C
RAM 1914/3834MB (lfb 46x4MB) SWAP 0/1917MB (cached 0MB) CPU [64%@2035,0%@345,0%@345,67%@2035,68%@2035,63%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@90.5C MCPU@90.5C PMIC@50C GPU@84.5C BCPU@90.5C thermal@88.1C
RAM 1917/3834MB (lfb 42x4MB) SWAP 0/1917MB (cached 0MB) CPU [73%@2035,0%@345,0%@345,66%@2035,67%@2035,74%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@90.5C MCPU@90.5C PMIC@50C GPU@85C BCPU@90.5C thermal@88.4C
RAM 1917/3834MB (lfb 37x4MB) SWAP 0/1917MB (cached 0MB) CPU [65%@2035,0%@345,0%@345,64%@2035,65%@2035,65%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91C MCPU@91C PMIC@50C GPU@85C BCPU@91C thermal@88.3C
RAM 1916/3834MB (lfb 35x4MB) SWAP 0/1917MB (cached 0MB) CPU [60%@2035,0%@345,0%@345,61%@2035,65%@2035,59%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91C MCPU@91C PMIC@50C GPU@85C BCPU@91C thermal@88.6C
RAM 1920/3834MB (lfb 30x4MB) SWAP 0/1917MB (cached 0MB) CPU [73%@2035,0%@345,0%@345,71%@2035,66%@2035,63%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91.5C MCPU@91.5C PMIC@50C GPU@85.5C BCPU@91.5C thermal@89.1C
RAM 1928/3834MB (lfb 25x4MB) SWAP 0/1917MB (cached 0MB) CPU [61%@2035,0%@345,0%@345,92%@2035,66%@2035,56%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@92C MCPU@92C PMIC@50C GPU@86C BCPU@92C thermal@89.1C
RAM 1916/3834MB (lfb 24x4MB) SWAP 0/1917MB (cached 0MB) CPU [60%@2035,0%@345,0%@345,63%@2035,55%@2035,58%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91.5C MCPU@91.5C PMIC@50C GPU@85.5C BCPU@91.5C thermal@89.6C
RAM 1916/3834MB (lfb 20x4MB) SWAP 0/1917MB (cached 0MB) CPU [61%@2035,0%@345,0%@345,59%@2035,59%@2035,63%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91.5C MCPU@91.5C PMIC@50C GPU@85.5C BCPU@91.5C thermal@89.3C
RAM 1919/3834MB (lfb 16x4MB) SWAP 0/1917MB (cached 0MB) CPU [67%@2035,0%@345,0%@345,63%@2035,70%@2035,73%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@92C MCPU@92C PMIC@50C GPU@86C BCPU@92C thermal@89.6C
RAM 1909/3834MB (lfb 14x4MB) SWAP 0/1917MB (cached 0MB) CPU [35%@2035,0%@345,0%@345,38%@2035,47%@2035,32%@2035] EMC_FREQ 0% GR3D_FREQ 0% PLL@91C MCPU@91C PMIC@50C GPU@86C BCPU@91C thermal@89.6C
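
For anyone wanting to chart a log like the one above rather than read it by eye, here is a minimal parsing sketch. It assumes the exact field layout shown in this excerpt (other tegrastats versions print different fields), and TegraSample and ParseTegrastatsLine are hypothetical names. In the excerpt above, the thermal reading climbs from 79.9C to 89.6C and the lfb count falls from 63 to 14; whether either is related to the glitches is not established in this thread.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// One parsed tegrastats line (only the fields used here).
struct TegraSample {
    int ram_used_mb = 0;
    int lfb_blocks = 0;             // NN in "lfb NNx4MB"
    double thermal_c = 0.0;         // value in "thermal@NN.NC"
    std::vector<int> cpu_load_pct;  // per-core load from "CPU [..]"
};

// Returns false if the line does not match the layout seen in the log above.
bool ParseTegrastatsLine(const std::string& line, TegraSample& out) {
    static const std::regex ram_re(R"(RAM (\d+)/\d+MB \(lfb (\d+)x4MB\))");
    static const std::regex thermal_re(R"(thermal@([0-9.]+)C)");
    static const std::regex cpu_block_re(R"(CPU \[([^\]]*)\])");
    static const std::regex core_re(R"((\d+)%@\d+)");

    std::smatch m;
    if (!std::regex_search(line, m, ram_re)) return false;
    out.ram_used_mb = std::stoi(m[1].str());
    out.lfb_blocks = std::stoi(m[2].str());

    if (!std::regex_search(line, m, thermal_re)) return false;
    out.thermal_c = std::stod(m[1].str());

    if (!std::regex_search(line, m, cpu_block_re)) return false;
    const std::string cpu_field = m[1].str();
    out.cpu_load_pct.clear();
    for (std::sregex_iterator it(cpu_field.begin(), cpu_field.end(), core_re), end;
         it != end; ++it) {
        out.cpu_load_pct.push_back(std::stoi((*it)[1].str()));
    }
    return true;
}
```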

Thanks again for your help.

Kind regards,
Adam

Hi,
Please try to feed frame data like:
Latency issue: nvv4l2h265enc accumulates four images before releasing the first - #3 by DaneLLL

I’m not sure, but the need-data signal looks to be periodic. There may be an improvement if you feed each frame as soon as you have it.

Hi Dane,

I’m not completely sure what you’re proposing? I’m emitting the “push-buffer” signal as soon as the next frame is available (the PushFrame function is called by the camera at a constant 10ms period). I have just been logging need-data as part of my debugging, but I’m not performing any actions based on it!

I enabled the MeasureEncoderLatency feature of nvv4l2h264enc and it looks like it’s perhaps not able to keep up with the frame rate? If the encoder ms value is accurate, it seems to struggle to maintain 10ms, and in the glitching periods I have seen it increase to over 20ms.

Are there any optimisations I can make to decrease this encoding time? Keeping a high fps and bit rate is pretty critical for our application.


KPI: v4l2: frameNumber= 4203 encoder= 11 ms pts= 42030000000
KPI: v4l2: frameNumber= 4204 encoder= 17 ms pts= 42040000000
KPI: v4l2: frameNumber= 4205 encoder= 13 ms pts= 42050000000
KPI: v4l2: frameNumber= 4206 encoder= 12 ms pts= 42060000000
KPI: v4l2: frameNumber= 4207 encoder= 12 ms pts= 42070000000
KPI: v4l2: frameNumber= 4208 encoder= 11 ms pts= 42080000000
KPI: v4l2: frameNumber= 4209 encoder= 11 ms pts= 42090000000
KPI: v4l2: frameNumber= 4210 encoder= 11 ms pts= 42100000000
KPI: v4l2: frameNumber= 4211 encoder= 12 ms pts= 42110000000
KPI: v4l2: frameNumber= 4212 encoder= 20 ms pts= 42120000000
KPI: v4l2: frameNumber= 4213 encoder= 14 ms pts= 42130000000
KPI: v4l2: frameNumber= 4214 encoder= 12 ms pts= 42140000000
KPI: v4l2: frameNumber= 4215 encoder= 12 ms pts= 42150000000
KPI: v4l2: frameNumber= 4216 encoder= 17 ms pts= 42160000000
KPI: v4l2: frameNumber= 4217 encoder= 18 ms pts= 42170000000
KPI: v4l2: frameNumber= 4218 encoder= 19 ms pts= 42180000000
KPI: v4l2: frameNumber= 4219 encoder= 13 ms pts= 42190000000
KPI: v4l2: frameNumber= 4220 encoder= 13 ms pts= 42200000000
KPI: v4l2: frameNumber= 4221 encoder= 15 ms pts= 42210000000
KPI: v4l2: frameNumber= 4222 encoder= 15 ms pts= 42220000000
KPI: v4l2: frameNumber= 4223 encoder= 12 ms pts= 42230000000
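
To summarise KPI output like the above over a longer run, a small parser can count how many frames exceed the 10 ms frame period of a 100fps stream. This is only a sketch that assumes the line layout shown in this log; KpiSample, ParseKpiLine and CountOverBudget are hypothetical names. Note that the encoder value is a latency figure, and an encoder that works on several frames at once can in principle sustain 100fps with a per-frame latency above 10 ms, so the count is indicative rather than proof of frame drops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// One parsed "KPI: v4l2: ..." line.
struct KpiSample {
    long frame_number;
    int encoder_ms;
    long long pts_ns;
};

// Returns false if the line does not match the layout seen in the log above.
bool ParseKpiLine(const std::string& line, KpiSample& out) {
    long frame = 0;
    int ms = 0;
    long long pts = 0;
    const int matched = std::sscanf(
        line.c_str(),
        "KPI: v4l2: frameNumber= %ld encoder= %d ms pts= %lld",
        &frame, &ms, &pts);
    if (matched != 3) return false;
    out = KpiSample{frame, ms, pts};
    return true;
}

// Counts samples whose reported encoder time is above the given budget.
std::size_t CountOverBudget(const std::vector<KpiSample>& samples, int budget_ms) {
    std::size_t count = 0;
    for (const KpiSample& s : samples) {
        if (s.encoder_ms > budget_ms) ++count;
    }
    return count;
}
```

In the excerpt above, every listed frame reports an encoder time above 10 ms, and the pts values advance by 10,000,000 per frame, which matches a 10 ms frame period in nanoseconds.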

Kind regards,
Adam

Hi,
Please try the suggested items:

  1. Set poc-type=2
  poc-type            : Set Picture Order Count type value
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 2 Default: 0
  2. Enable slice encoding
  slice-header-spacing: Slice Header Spacing number of macroblocks/bits in one packet
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 0
  3. Try nvv4l2h265enc

Hi Dane,

Thank you for the suggestions. I tried using the 265 encoder with the settings you provided (I set slice-header-spacing to 2304000 bits which is the size of one frame - not sure if there’s a better value for this?) but unfortunately the problem is still there. Again it’s really odd, as sometimes I can encode for minutes before getting the stuttering and frame dropping, and other times it happens almost immediately.

I’m afraid I’m at the limit of my technical ability solving this. Am I being unreasonable in what I am asking the Jetson to do? I’ve created a minimal application that only pushes frames and encodes them, and the problem is still there.

Thank you again for all your help up until this point!

Kind regards,
Adam

Hi,
Please run sudo tegrastats and see if there are any further clues. Generally we encode at 30fps or 60fps. Encoding at 100fps may be hitting the limit of a single CPU core, and there may be latency in scheduling.

Also, are you using JetPack 4.6.4, and do you want to encode 1920x1080 at 100fps?

Hi Dane,

I’ve attached tegrastats for a minute of footage where glitching occurred quite a lot. I notice that VIC_FREQ is changing quite a lot, could that be an indicator of anything?

I’m on Jetpack 4.6.2 because that’s the latest BSP provided by the carrier board manufacturer, and I’m attempting to encode 1920x1200 grayscale at 100fps with an ideal bitrate of 128Mb/s. Is there a way that I can use additional CPU cores, or perhaps buffer the raw video data? I don’t need the encoding to be real-time as the camera will not be on continuously, but any frame dropping or the glitching issues will mess up our synchronisation of the video feed with other data feeds.

If I’m hitting the limits of performance I may need to look at reducing the bitrate and potentially cropping the video as there are some regions I can discard.
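
For reference, the numbers involved can be worked out from the figures I quoted above (1920x1200 GRAY8, 100fps, 128Mb/s). This is just a back-of-the-envelope sketch; StreamConfig and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in this thread; adjust for other setups.
struct StreamConfig {
    std::uint64_t width = 1920;
    std::uint64_t height = 1200;
    std::uint64_t fps = 100;
    std::uint64_t bitrate_bits_per_sec = 128000000;
};

// Bytes in one 8-bit grayscale frame (one byte per pixel).
std::uint64_t Gray8FrameBytes(const StreamConfig& c) {
    return c.width * c.height;
}

// Raw (unencoded) data rate pushed into the pipeline, in bytes per second.
std::uint64_t RawBytesPerSec(const StreamConfig& c) {
    return Gray8FrameBytes(c) * c.fps;
}

// Encoded data rate the file sink has to sustain, in bytes per second.
std::uint64_t EncodedBytesPerSec(const StreamConfig& c) {
    return c.bitrate_bits_per_sec / 8;
}

// Time available per frame, in microseconds, before the pipeline falls behind.
std::uint64_t FrameBudgetUs(const StreamConfig& c) {
    assert(c.fps > 0);
    return 1000000 / c.fps;
}
```

With the defaults this gives 2,304,000 bytes per frame, 230.4 MB/s of raw input, 16 MB/s of encoded output, and a 10 ms budget per frame.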

Thank you again for your help, this has been really useful!

tegrastats_dump.txt (25.3 KB)

Out of interest, I placed fpsdisplaysink after each element in the pipeline. After nvvidconv, the framerate struggles to hold 100fps, averaging 96fps and dropping as low as 89fps, but after nvv4l2h264enc it stabilises back to a near-constant 100fps. The frame drop rate for both was 0.

Is it possible that nvvidconv is causing some kind of issue? I’m setting the following caps to my appsrc:

    GstCaps* caps = gst_caps_new_simple("video/x-raw",
      "format", G_TYPE_STRING, "GRAY8",
      "width", G_TYPE_INT, 1920,
      "height", G_TYPE_INT, 1200,
      "framerate", GST_TYPE_FRACTION, 100, 1,
      NULL);

I am also applying the fix for washed out GRAY8 sources that you suggested here: GRAY8 source with nvvidconv and nvv4l2h264enc produces video with wrong color format (washed out) - Jetson & Embedded Systems / Jetson Nano - NVIDIA Developer Forums

Hi,
The current pipeline is

appsrc name=appsrc ! 'video/x-raw,format=GRAY8' ! nvvidconv ! 'video/x-raw(memory:NVMM),format=I420' ! ...

Please check if you can run like:

appsrc name=appsrc ! 'video/x-raw,format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! ...

To convert GRAY8 to I420, nvvidconv has to call gst_nvvconv_do_clearchroma(), which runs on the CPU. This may be the bottleneck. You can try feeding the frame data in NV12 to avoid this function and see if there is an improvement.
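
One way to produce NV12 from a grayscale frame on the application side is to use the GRAY8 data as the Y plane and append an interleaved UV plane filled with the neutral value 128. The following is a minimal sketch of that idea, not the code used in this thread; Gray8ToNv12 is a hypothetical helper, and it assumes tightly packed rows (no stride padding) and even width and height.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Builds a tightly packed NV12 frame from a tightly packed GRAY8 frame.
// Layout: width*height bytes of Y, then width*height/2 bytes of interleaved UV.
std::vector<std::uint8_t> Gray8ToNv12(const std::uint8_t* gray,
                                      std::size_t width,
                                      std::size_t height) {
    assert(gray != nullptr);
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t luma_bytes = width * height;
    // Pre-fill everything with 128 so the chroma plane is neutral (no colour).
    std::vector<std::uint8_t> nv12(luma_bytes + luma_bytes / 2, 128);
    // The grayscale samples become the Y plane unchanged.
    std::memcpy(nv12.data(), gray, luma_bytes);
    return nv12;
}
```

Allocating a new vector per frame adds its own cost at 100fps; a reusable buffer whose UV plane is filled once would avoid that. This sketch also does nothing about the value range of the luma samples.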

Also, please try increasing the number of output buffers of nvvidconv:

  output-buffers      : number of output buffers
                        flags: readable, writable, changeable in NULL, READY, PAUSED or PLAYING state
                        Unsigned Integer. Range: 1 - 4294967295 Default: 4

Hi Dane,

Thank you for the suggestion. I have managed to get my raw frames in NV12 format, and I have found that putting queue max-size-buffers=0 max-size-bytes=0 max-size-time=0 just before my filesink seems to have finally eliminated the glitching.

The queue seems to fill up at times with 200+ buffers before emptying, so I guess writing to the file is causing some throttling and the issues. It has been a bit annoying that I cannot find any warnings or messages to indicate that frames are being dropped!
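
For anyone landing here later, this is roughly how the pipeline description could be assembled for gst_parse_launch with the NV12 caps and the unbounded queue in place. It is a sketch pieced together from the fragments posted in this thread, not a verbatim copy of my final pipeline; BuildRecordPipeline is a hypothetical helper, and keeping the original encoder properties is an assumption.

```cpp
#include <cassert>
#include <string>

// Returns a launch description for gst_parse_launch(). The location must not
// contain spaces, since it is inserted into the description unquoted.
std::string BuildRecordPipeline(const std::string& location) {
    assert(!location.empty());
    return std::string("appsrc name=appsrc ! ")
        + "video/x-raw,format=NV12 ! "
        + "nvvidconv ! "
        + "video/x-raw(memory:NVMM),format=NV12 ! "
        + "nvv4l2h264enc name=nvv4l2h264enc maxperf-enable=1 bitrate=128000000 ! "
        + "h264parse ! matroskamux ! "
        // Unbounded queue so stalls in file writing do not block the encoder.
        + "queue max-size-buffers=0 max-size-bytes=0 max-size-time=0 ! "
        + "filesink name=filesink location=" + location;
}
```

At 100fps, a backlog of 200 buffers corresponds to about 2 seconds of video, or roughly 32MB at 16MB/s. Because all three queue limits are set to 0, nothing caps that growth if the storage falls behind for longer.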

I am continuing to perform some stress tests, but for now I will mark your suggestion as a solution as I think this is probably the best I can get with the TX2.

Thanks,
Adam
