Hi,
We’re experiencing nvargus-daemon crashes after 20-45 minutes of continuous streaming with 8 GMSL2 cameras on Jetson AGX Orin. We’ve done extensive testing and narrowed this down to
known nvargus/kernel issues.
We’re currently on L4T 36.4.4 (JetPack 6.2.1) and plan to OTA upgrade to L4T 36.5.0 (JetPack 6.2.2), since @JerryChang confirmed in topic 325399 comment
#9 that the host1x-fence memory leak fix is included in r36.5.0. However, the
r36.5.0 release notes don’t explicitly mention camera stability fixes, so we’d
like to confirm whether all the relevant patches are included, or if additional library replacements are still needed after upgrading.
Hardware
- Jetson AGX Orin 64GB Developer Kit
- 4x Stereolabs ZED X stereo cameras (8 GMSL2 streams total) via GMSL2 capture board
- All cameras streaming 1080p30 simultaneously
Software
- L4T 36.4.4 (JetPack 6.2.1), installed June 2025
- ZED SDK 5.x
- enableCamInfiniteTimeout=1 already set in nvargus-daemon service
Current library versions (L4T 36.4.4)
md5: 142547f5 /usr/lib/aarch64-linux-gnu/nvidia/libnvargus.so (1.0 MB)
md5: 8212ca9b /usr/lib/aarch64-linux-gnu/nvidia/libnvargus_socketserver.so (1.8 MB)
md5: ca971fd1 /usr/lib/aarch64-linux-gnu/nvidia/libnvscf.so (7.8 MB)
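For anyone who wants to compare their own libraries against the values above, here is a minimal sketch of how such a listing can be produced. It assumes the hashes shown are the first 8 hex characters of the MD5 digest and that sizes are reported in MiB rounded to one decimal; both are assumptions about the listing, and the `fingerprint` helper is a hypothetical name.

```python
import hashlib
import os

# Library directory and file names are taken from the listing above.
LIB_DIR = "/usr/lib/aarch64-linux-gnu/nvidia"
LIBS = ["libnvargus.so", "libnvargus_socketserver.so", "libnvscf.so"]


def fingerprint(path, prefix_len=8):
    """Return (md5 hex prefix, size in MiB rounded to 0.1) for one file."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large libraries are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    size_mb = os.path.getsize(path) / (1024 * 1024)
    return digest.hexdigest()[:prefix_len], round(size_mb, 1)


if __name__ == "__main__":
    for name in LIBS:
        path = os.path.join(LIB_DIR, name)
        if os.path.exists(path):
            md5_prefix, size_mb = fingerprint(path)
            print(f"md5: {md5_prefix} {path} ({size_mb} MB)")
        else:
            print(f"missing: {path}")
```

Running this before and after the OTA upgrade would show at a glance whether any of the three libraries actually changed.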
Symptoms
Two distinct failure modes, both after 20-45 minutes of continuous 8-camera streaming:
Mode 1 — FUSA VI handler SEGV:
nvargus-daemon crashes with SIGSEGV (signal 11). Journal shows:
SCF: Error InvalidState: (propagating from /dvs/git/…/fusaViHandler.cpp, function handleCorrErrors(), line 817)
SCF: Error InvalidState: Corr Error 8 for sensor ID X
Followed by daemon crash and systemd restart. All camera sessions are invalidated after restart.
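To see which sensors hit the correctable-error path before a crash, the journal text can be tallied with a short script. This is only a sketch: the regular expression is based solely on the two log lines quoted above, the function names are hypothetical, and the sensor ID is matched as any non-whitespace token because the real value is shown here as a placeholder.

```python
import re
from collections import Counter

# Pattern derived from the quoted line "Corr Error 8 for sensor ID X".
CORR_ERROR_RE = re.compile(r"Corr Error (\d+) for sensor ID (\S+)")


def parse_corr_errors(lines):
    """Return a list of (error_code, sensor_id) for matching log lines."""
    hits = []
    for line in lines:
        match = CORR_ERROR_RE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2)))
    return hits


def count_by_sensor(hits):
    """Count how many Corr Error lines were seen per sensor ID."""
    return Counter(sensor_id for _code, sensor_id in hits)


# Example input: the two journal lines quoted in this post.
sample = [
    "SCF: Error InvalidState: (propagating from fusaViHandler.cpp, "
    "function handleCorrErrors(), line 817)",
    "SCF: Error InvalidState: Corr Error 8 for sensor ID X",
]
print(count_by_sensor(parse_corr_errors(sample)))
```

Feeding it the saved journal output for the nvargus-daemon unit would show whether the errors cluster on one camera or are spread across all eight streams.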
Mode 2 — Camera FAILURE without daemon crash:
ZED SDK reports FAILURE in sl::Camera::grab() and CAMERA REBOOTING for all cameras simultaneously. Frame counts stop increasing. nvargus-daemon continues running (no SEGV in
journal).
In both cases, the only recovery is restarting nvargus-daemon + reopening all ZED camera sessions.
Test results
We ran extensive stress tests:
| Test | Result |
|---|---|
| 10s recordings x30 | 30/30 PASS |
| 15-min recordings x8 (fresh nvargus restart before each batch) | 8/8 PASS |
| 15-min recordings x15 (no nvargus restart between batches) | 3/15 PASS — fails after ~30-45 min cumulative |
| 30-min continuous recording x3 | 0/3 PASS — fails at 8-22 min |
Short recordings with stop/start cycles are 100% reliable. The failure correlates with cumulative continuous streaming time, not recording count.
What we’ve checked on L4T 36.4.4
- enableCamInfiniteTimeout=1: already enabled
- host1x-fence.ko: our module does not contain the dma_fence_put symbol, confirming the memory leak fix from topic 325399 is missing
- capture-ivc.ko: our module does not contain semaphore symbols (sema_init, down_timeout), suggesting the multi-camera race condition fix referenced in topic 355130 is missing
- libnvscf.so: our version is 7.8 MB, while the patched version from topic 355130 is 8.5 MB
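The symbol checks above can be reproduced roughly as follows. This is a coarse sketch under stated assumptions: the modules are uncompressed .ko files somewhere under /lib/modules/<running kernel release>, and a plain byte search for the symbol name (similar in spirit to piping `strings` into `grep`) is good enough. A substring match can give false positives when a longer symbol contains the name, and `find_module` and `contains_symbol` are hypothetical helper names.

```python
import os

# Module file names and symbols are the ones discussed in this post.
CHECKS = {
    "host1x-fence.ko": ["dma_fence_put"],
    "capture-ivc.ko": ["sema_init", "down_timeout"],
}


def find_module(filename, root):
    """Return the first path under root whose basename is filename, or None."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None


def contains_symbol(module_path, symbol):
    """Return True if the raw module bytes contain the symbol name."""
    with open(module_path, "rb") as fh:
        data = fh.read()
    return symbol.encode("ascii") in data


if __name__ == "__main__":
    # Assumption: modules for the running kernel live under this directory.
    root = os.path.join("/lib/modules", os.uname().release)
    for filename, symbols in CHECKS.items():
        path = find_module(filename, root)
        if path is None:
            print(f"{filename}: not found under {root}")
            continue
        for symbol in symbols:
            state = "present" if contains_symbol(path, symbol) else "absent"
            print(f"{filename}: {symbol} {state}")
```

Running the same script after the upgrade to r36.5.0 would give a quick before/after comparison for both modules.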
Questions
We plan to OTA upgrade to JetPack 6.2.2 (L4T r36.5.0). Before we do, can you confirm whether r36.5.0 includes the following fixes:
- host1x-fence memory leak: dma_fence_put fix (topic 325399). We believe yes per comment #9, but the r36.5.0 release notes don’t list it.
- capture-ivc semaphore: multi-camera race condition fix referenced in topic 355130
- Updated libnvscf.so: infinite timeout / long-run stability fix from topic 355130
- Updated libnvargus.so / libnvargus_socketserver.so: stability patches previously distributed via forum threads (topic 324126, topic 320518)
If any of the above are not included in r36.5.0, could you provide patched binaries for that release?
Thank you!