We’ve got our Jetson Xavier based platform up and running with our custom GMSL2 cameras. We are successfully capturing image data from our cameras at a 10fps rate via V4L2 buffers. This is done using an isolated CPU that is dedicated to image capture. Performing timing analysis on the captures, however, shows that we encounter intermittent jitter issues on the order of about 2ms. We have a critical processing time budget for our project and this 2ms jitter is causing us problems.
We’re looking for guidance at what might be inducing this and how we can best address it. Please read below for details.
In terms of setup, we’ve got jetson_clocks configured and are running nvpmodel 0 (MAXN):
nvidia@tegra-ubuntu:~/target$ sudo jetson_clocks --show
SOC family:tegra194 Machine:jetson-xavier
Online CPUs: 0-7
CPU Cluster Switching: Disabled
cpu0: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu6: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu7: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
GPU MinFreq=1377000000 MaxFreq=1377000000 CurrentFreq=1377000000
EMC MinFreq=204000000 MaxFreq=2133000000 CurrentFreq=2133000000 FreqOverride=1
Fan: speed=255
NV Power Mode: MAXN
We’ve got highres kernel timers in place that generate a sync signal to the cameras for image capture. The sync signal generation as well as the sof/eof timing reported in the trace log do not exhibit any jitter problems.
Using ftrace (via the trace-cmd
tool), we’ve been able to see that the delta time from “tegra_channel_capture_frame: eof:” to “vb2_dqbuf:” takes an average of ~311us. When the jitter manifests itself, this time can be as much as ~2.5ms. For a 10000 frame capture, we have seen anywhere from 10-30 occurrences of these extended jitter times.
We’ve dug into the ftrace capture data and have compared a ‘normal’ (~245us) capture delta vs a jitter (~2.45ms) capture delta. Below are some ftrace snippets with very large execution deltas when a jitter issue was detected:
snippet #1
vi-outpu-18382 7...1 5932.813431: function: vb2_buffer_done
vi-outpu-18382 7...1 5932.813432: function: vb2_dc_finish
vi-outpu-18382 7...1 5932.813432: function: __iommu_sync_sg_for_cpu
vi-outpu-18382 7...1 5932.813432: function: __iommu_sync_sg_for_cpu.part.12
# the delta for the next line is 5932.814313 - 5932.813432 = 881us!
vi-outpu-18382 7d..1 5932.814313: function: __handle_domain_irq
vi-outpu-18382 7d..1 5932.814316: function: irq_enter
vi-outpu-18382 7d..1 5932.814316: function: rcu_irq_enter
snippet #2:
vi-outpu-18382 7d.s1 5932.814385: function: irqtime_account_irq
vi-outpu-18382 7d.s1 5932.814385: function: irqtime_account_delta.isra.0
vi-outpu-18382 7d.s1 5932.814386: function: __local_bh_enable
vi-outpu-18382 7d..1 5932.814386: function: idle_cpu
vi-outpu-18382 7d..1 5932.814386: function: rcu_irq_exit
# the delta for the next line is 5932.815317 - 5932.814386 = 931us!
vi-outpu-18382 7d..1 5932.815317: function: __handle_domain_irq
vi-outpu-18382 7d..1 5932.815318: function: irq_enter
vi-outpu-18382 7d..1 5932.815318: function: rcu_irq_enter
vi-outpu-18382 7d..1 5932.815318: function: irqtime_account_irq
vi-outpu-18382 7d.h1 5932.815318: function: irq_find_mapping
vi-outpu-18382 7d.h1 5932.815319: function: generic_handle_irq
vi-outpu-18382 7d.h1 5932.815319: function: handle_percpu_devid_irq
vi-outpu-18382 7d.h. 5932.815320: irq_handler_entry: irq=2 name=arch_timer
snippet #3
vi-outpu-18382 7d.s1 5932.815378: function: irqtime_account_irq
vi-outpu-18382 7d.s1 5932.815378: function: irqtime_account_delta.isra.0
vi-outpu-18382 7d.s1 5932.815378: function: __local_bh_enable
vi-outpu-18382 7d..1 5932.815379: function: idle_cpu
vi-outpu-18382 7d..1 5932.815379: function: rcu_irq_exit
# the delta for the next line is 5932.815777 - 5932.815379 = 398us!
vi-outpu-18382 7...1 5932.815777: function: _raw_spin_lock_irqsave
vi-outpu-18382 7d..2 5932.815778: function: _raw_spin_unlock_irqrestore
vi-outpu-18382 7.... 5932.815783: vb2_buf_done: owner = 0xffffffc3deaf7c80, queued = 10, owned_by_drv = 9, index = 1, type = 1, bytesused = 8294400, timestamp = 5932581240250
Summing the worst delta times from above gives: 881us + 931us + 398us = 2.21ms!
Are there any hints or guidance on what may be inducing this? At a guess, there might be some hardware-level operations going on (DMA perhaps?) that are causing the time delays in the ftrace. It also could be that we need to enable additional ftrace options in the kernel config to fill in what’s going on above. While ftrace is helping out greatly, it’s not painting to whole picture for us quite yet.
I’ve got a 600 frame capture trace that the snippets above were pulled from that I can provide. It is, however, too large for me to upload to the forum, or I’d just attach it here.
Thanks.