Find a recurring performance blocker

We are finding some hiccups in recording video. To root cause this we are using tegrastats to monitor system performance. In the tegrastats output we can see a very suspicious drop about every 45 seconds (see below).

Questions:

  1. Is there an option for tegrastats to print the absolute (or elapsed) time so that we can correlate this output with other system traces?

  2. /var/log/syslog does not show anything suspicious with this periodicity. Are there any other tools that might help us identify the offending process?

We are concurrently using atop and iostat but nothing jumps out at us.

1 RAM 1540/3983MB (lfb 20x1MB) cpu [0%,0%,0%,0%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
  2 RAM 1540/3983MB (lfb 20x1MB) cpu [32%,31%,37%,30%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
  3 RAM 1540/3983MB (lfb 20x1MB) cpu [36%,37%,34%,38%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
  4 RAM 1539/3983MB (lfb 20x1MB) cpu [40%,31%,35%,37%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
  5 RAM 1541/3983MB (lfb 20x1MB) cpu [38%,37%,30%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
  6 RAM 1541/3983MB (lfb 20x1MB) cpu [38%,36%,27%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
  7 RAM 1540/3983MB (lfb 20x1MB) cpu [36%,36%,32%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
  8 RAM 1541/3983MB (lfb 20x1MB) cpu [32%,35%,33%,27%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
  9 RAM 1541/3983MB (lfb 20x1MB) cpu [33%,35%,31%,33%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 10 RAM 1541/3983MB (lfb 20x1MB) cpu [37%,31%,32%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 11 RAM 1542/3983MB (lfb 20x1MB) cpu [36%,34%,32%,35%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 12 RAM 1542/3983MB (lfb 20x1MB) cpu [33%,33%,35%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 13 RAM 1541/3983MB (lfb 20x1MB) cpu [36%,33%,27%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 14 RAM 1542/3983MB (lfb 20x1MB) cpu [32%,31%,25%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 15 RAM 1543/3983MB (lfb 20x1MB) cpu [35%,29%,31%,36%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 16 RAM 1542/3983MB (lfb 20x1MB) cpu [31%,36%,33%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 17 RAM 1542/3983MB (lfb 20x1MB) cpu [36%,33%,28%,34%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 18 RAM 1543/3983MB (lfb 20x1MB) cpu [25%,27%,25%,22%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
 19 RAM 1543/3983MB (lfb 20x1MB) cpu [9%,9%,9%,3%]@1734 EMC 19%@1600 APE 25 MSENC 435 GR3D 0%@998
 20 RAM 1543/3983MB (lfb 20x1MB) cpu [6%,4%,1%,2%]@1734 EMC 13%@1600 APE 25 GR3D 0%@998
 21 RAM 1543/3983MB (lfb 20x1MB) cpu [4%,7%,4%,5%]@1734 EMC 9%@1600 APE 25 GR3D 0%@998
 22 RAM 1543/3983MB (lfb 20x1MB) cpu [12%,10%,10%,12%]@1734 EMC 8%@1600 APE 25 MSENC 435 GR3D 0%@998
 23 RAM 1543/3983MB (lfb 20x1MB) cpu [32%,28%,29%,23%]@1734 EMC 12%@1600 APE 25 MSENC 435 GR3D 0%@998
 24 RAM 1544/3983MB (lfb 20x1MB) cpu [36%,32%,30%,32%]@1734 EMC 16%@1600 APE 25 MSENC 435 GR3D 0%@998
 25 RAM 1544/3983MB (lfb 20x1MB) cpu [29%,35%,31%,32%]@1734 EMC 19%@1600 APE 25 MSENC 435 GR3D 0%@998
 26 RAM 1544/3983MB (lfb 20x1MB) cpu [34%,38%,31%,31%]@1734 EMC 21%@1600 APE 25 MSENC 435 GR3D 0%@998
 27 RAM 1545/3983MB (lfb 20x1MB) cpu [35%,31%,30%,28%]@1734 EMC 22%@1600 APE 25 MSENC 435 GR3D 0%@998
 28 RAM 1545/3983MB (lfb 20x1MB) cpu [41%,39%,34%,29%]@1734 EMC 23%@1600 APE 25 MSENC 435 GR3D 0%@998
 29 RAM 1545/3983MB (lfb 20x1MB) cpu [32%,34%,32%,30%]@1734 EMC 23%@1600 APE 25 MSENC 435 GR3D 0%@998
 30 RAM 1546/3983MB (lfb 20x1MB) cpu [33%,32%,34%,31%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
 31 RAM 1546/3983MB (lfb 20x1MB) cpu [30%,30%,29%,37%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
 32 RAM 1547/3983MB (lfb 20x1MB) cpu [38%,31%,35%,25%]@1734 EMC 24%@1600 APE 25 MSENC 435 GR3D 0%@998
 33 RAM 1548/3983MB (lfb 20x1MB) cpu [37%,36%,32%,34%]@1734 EMC 25%@1600 APE 25 MSENC 290 GR3D 0%@998
 34 RAM 1548/3983MB (lfb 20x1MB) cpu [31%,34%,38%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 35 RAM 1549/3983MB (lfb 20x1MB) cpu [36%,37%,36%,36%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 36 RAM 1548/3983MB (lfb 20x1MB) cpu [38%,31%,33%,39%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 37 RAM 1549/3983MB (lfb 20x1MB) cpu [34%,29%,31%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 38 RAM 1549/3983MB (lfb 20x1MB) cpu [32%,33%,33%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 39 RAM 1550/3983MB (lfb 20x1MB) cpu [32%,34%,32%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 40 RAM 1550/3983MB (lfb 20x1MB) cpu [32%,34%,35%,34%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 41 RAM 1550/3983MB (lfb 20x1MB) cpu [36%,37%,33%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 42 RAM 1551/3983MB (lfb 20x1MB) cpu [36%,37%,34%,36%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 43 RAM 1551/3983MB (lfb 20x1MB) cpu [38%,37%,32%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 44 RAM 1551/3983MB (lfb 20x1MB) cpu [38%,37%,30%,27%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 45 RAM 1550/3983MB (lfb 20x1MB) cpu [39%,35%,29%,36%]@1734 EMC 26%@1600 APE 25 MSENC 435 GR3D 0%@998
 46 RAM 1551/3983MB (lfb 20x1MB) cpu [36%,33%,31%,34%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 47 RAM 1552/3983MB (lfb 20x1MB) cpu [40%,35%,37%,29%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 48 RAM 1551/3983MB (lfb 20x1MB) cpu [38%,32%,32%,35%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 49 RAM 1552/3983MB (lfb 20x1MB) cpu [32%,34%,32%,32%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 50 RAM 1551/3983MB (lfb 20x1MB) cpu [33%,35%,30%,28%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 51 RAM 1551/3983MB (lfb 20x1MB) cpu [34%,35%,30%,31%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 52 RAM 1552/3983MB (lfb 20x1MB) cpu [37%,35%,31%,33%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 53 RAM 1552/3983MB (lfb 20x1MB) cpu [37%,33%,32%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 54 RAM 1552/3983MB (lfb 20x1MB) cpu [34%,35%,29%,33%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 55 RAM 1551/3983MB (lfb 20x1MB) cpu [33%,28%,33%,39%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 56 RAM 1550/3983MB (lfb 20x1MB) cpu [34%,32%,33%,27%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 57 RAM 1551/3983MB (lfb 20x1MB) cpu [38%,33%,31%,29%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 58 RAM 1551/3983MB (lfb 20x1MB) cpu [34%,28%,30%,37%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 59 RAM 1551/3983MB (lfb 20x1MB) cpu [32%,31%,34%,35%]@1734 EMC 26%@1600 APE 25 MSENC 435 GR3D 0%@998
 60 RAM 1552/3983MB (lfb 20x1MB) cpu [33%,32%,31%,30%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 61 RAM 1552/3983MB (lfb 20x1MB) cpu [37%,31%,34%,32%]@1734 EMC 26%@1600 APE 25 MSENC 435 GR3D 0%@998
 62 RAM 1551/3983MB (lfb 20x1MB) cpu [33%,34%,32%,29%]@1734 EMC 26%@1600 APE 25 MSENC 435 GR3D 0%@998
 63 RAM 1552/3983MB (lfb 20x1MB) cpu [32%,36%,30%,31%]@1734 EMC 26%@1600 APE 25 MSENC 435 GR3D 0%@998
 64 RAM 1552/3983MB (lfb 20x1MB) cpu [28%,32%,27%,28%]@1734 EMC 25%@1600 APE 25 MSENC 435 GR3D 0%@998
 65 RAM 1552/3983MB (lfb 20x1MB) cpu [29%,23%,18%,28%]@1734 EMC 21%@1600 APE 25 MSENC 435 GR3D 0%@998
 66 RAM 1552/3983MB (lfb 20x1MB) cpu [7%,4%,2%,3%]@1734 EMC 14%@1600 APE 25 GR3D 0%@998
 67 RAM 1552/3983MB (lfb 20x1MB) cpu [4%,4%,0%,4%]@1734 EMC 10%@1600 APE 25 GR3D 0%@998
 68 RAM 1552/3983MB (lfb 20x1MB) cpu [8%,5%,1%,2%]@1734 EMC 7%@1600 APE 25 GR3D 0%@998
 69 RAM 1552/3983MB (lfb 20x1MB) cpu [20%,23%,22%,22%]@1734 EMC 10%@1600 APE 25 MSENC 435 GR3D 0%@998
 70 RAM 1552/3983MB (lfb 20x1MB) cpu [31%,35%,29%,32%]@1734 EMC 14%@1600 APE 25 MSENC 435 GR3D 0%@998
 71 RAM 1552/3983MB (lfb 20x1MB) cpu [30%,32%,30%,29%]@1734 EMC 18%@1600 APE 25 MSENC 435 GR3D 0%@998

Hi sperok,
It seems to be thermal issue that HW is overheated? Do you use devkit or your custom board?

It is definitely not thermal. The TX-1 is mounted on a Leopard Imaging Carrier card and has 3 LI IMX-274 sensors attached. It is housed in an aluminum enclosure which acts as a heat sink. Three identical units are on a test jig with an external fan blowing air across them. Without the fan temperature maxes out at 87C as reported by the onboard temp sensor. WIth the fan temp is around 45C. The problem occurs repeatedly and consistently even right after power-up. As you can tell from the trace the system is running with the clocks turned up.

There are also no temp warnings in syslog.

Hi Sperok, do you also reproduce the issue with default camera ov5693?

We have not tried to reproduce with the onboard camera because the problem is occurring when we launch two concurrent gstreamer pipelines. We can’t do the same test with a single camera.

What I am looking for is some advice on tools to track down a reproducible problem that is occurring every 45 seconds or so. I don’t expect anyone to dive in and identify this specific issue for us.

To isolate this issue we are currently using atop, syslog and iostat as well as running gstreamer with various debug settings enabled.

Each pipeline run independently works fine. The order of startup does not matter. The overall system load seems OK. Tegrastats don’t indicate any overload. CPU temp as reported by But once the second pipeline is started we get this anomaly every 45 seconds or so and can’t figure out what process/thread is getting blocked. We assume something is getting blocked because performance indicators in tegrastats and atop don’t show any cpu spike.

Since the anomaly appears to last for a few seconds, I’ll create a script that monitors tegrastats output and tries to capture instantaneous system state. If there are other tools that already exist to do something like this it would be great to learn of them.

Hi sperok,
For more information, is it fixed 45 seconds for two and three cameras? What is the sensor resolution?