Description
“I have about 2000 Jetson Nano devices (NVIDIA Tegra X1) running AI models for camera devices. After about two years of operation, around 20 of them have started experiencing GPU hangs. Once an affected device has been running for a while, my firmware can no longer load the AI model onto the GPU, and some interrupts, such as IRQ/79-gk20a_st, appear to hang as well. I have tried everything I could to kill or restart them, but to no avail. The only workaround is to reboot the device, and the issue recurs after just 1-2 days. Could you suggest a solution?”
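One way to capture evidence of the hang before rebooting is to record the scheduler state of the gk20a interrupt threads mentioned above. The following is a minimal sketch, not a fix: it assumes the threads are visible under /proc with "gk20a" in their name, and the helper `find_threads` is a hypothetical name introduced only for illustration. A thread reported in state "D" (uninterruptible sleep) does not respond to signals until it wakes, which would be consistent with kill attempts having no effect.

```python
import os


def find_threads(name_fragment="gk20a", proc_root="/proc"):
    """Return sorted (pid, comm, state) tuples for tasks whose name contains name_fragment.

    find_threads is a hypothetical helper; "gk20a" is assumed from the
    thread name quoted in the description.
    """
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are tasks
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
            if name_fragment not in comm:
                continue
            state = "?"
            with open(os.path.join(proc_root, entry, "status")) as f:
                for line in f:
                    if line.startswith("State:"):
                        # e.g. "State:\tD (disk sleep)"
                        state = line.split(":", 1)[1].strip()
                        break
        except OSError:
            continue  # task exited while we were scanning
        matches.append((int(entry), comm, state))
    return sorted(matches)


if __name__ == "__main__":
    for pid, comm, state in find_threads():
        print(f"{pid}\t{comm}\t{state}")
```

Running this periodically on an affected device and saving the output alongside the kernel log may help show whether the thread is stuck before the model load starts failing.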
Environment
root@ubuntu:/home/nano# jetson_release
Software part of jetson-stats 4.2.12 - (c) 2024, Raffaello Bonghi
Model: NVIDIA Jetson Nano Developer Kit - Jetpack 4.6.5 [L4T 32.7.5]
NV Power Mode[0]: MAXN
Serial Number: [XXX Show with: jetson_release -s XXX]
Hardware:
- P-Number: p3448-0002
- Module: NVIDIA Jetson Nano module (16Gb eMMC)
Platform:
- Distribution: Ubuntu 18.04 Bionic Beaver
- Release: 4.9.337-tegra
jtop:
- Version: 4.2.12
- Service: Active
Libraries:
- CUDA: 10.2.300
- cuDNN: 8.2.1.32
- TensorRT: 8.2.1.9
- VPI: 1.2.3
- Vulkan: 1.2.70
- OpenCV: 4.1.1 - with CUDA: NO