Hi,
I’m facing an issue on an NVIDIA Jetson Orin Nano where the GPU is not being detected.
PyTorch reports that CUDA is not available:
fov@marvel-fov-8:~$ python
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print("CUDA is available" if torch.cuda.is_available() else "CUDA is not available")
CUDA is not available
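For more detail than the one-liner above, a check along these lines could also report whether the installed PyTorch build includes CUDA support at all, which would help separate a CPU-only wheel from a driver or device problem. This is only a sketch: `cuda_report` is a made-up helper name, and no output from the affected device is shown here.

```python
import importlib


def cuda_report():
    """Collect basic facts about the installed PyTorch build (hypothetical helper)."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this interpreter at all.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means the wheel was built without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is a version string but `cuda_available` is still false, that would suggest the PyTorch build itself is fine and the problem sits lower down, which is consistent with the jtop warning further on.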
The logs from running sudo jtop say that no GPUs are available:
May 22 14:15:59 marvel-fov-8 systemd[1]: Started jtop service.
May 22 14:15:59 marvel-fov-8 systemd[472530]: jtop.service: Failed to execute command: No such file or directory
May 22 14:15:59 marvel-fov-8 systemd[472530]: jtop.service: Failed at step EXEC spawning /usr/local/bin/jtop: No such file or directory
May 22 14:15:59 marvel-fov-8 systemd[1]: jtop.service: Main process exited, code=exited, status=203/EXEC
May 22 14:15:59 marvel-fov-8 systemd[1]: jtop.service: Failed with result 'exit-code'.
May 22 14:16:09 marvel-fov-8 systemd[1]: jtop.service: Scheduled restart job, restart counter is at 1.
May 22 14:16:09 marvel-fov-8 systemd[1]: Stopped jtop service.
May 22 14:16:09 marvel-fov-8 systemd[1]: Started jtop service.
May 22 14:16:09 marvel-fov-8 jtop[472547]: [INFO] jtop.core.config - Build service folder in /usr/local/jtop
May 22 14:16:09 marvel-fov-8 jtop[472547]: [INFO] jtop.service - jetson_stats 4.2.8 - server loaded
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.service - Running on Python: 3.8.10
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.hardware - Hardware detected aarch64
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.hardware - NVIDIA Jetson 699-level Part Number=699-13767-0005-300 K.2
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.hardware - NVIDIA Jetson Module=NVIDIA Jetson Orin Nano (Developer kit)
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.hardware - NVIDIA Jetson detected L4T=35.3.1
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.cpu - Found 6 CPU
**May 22 14:16:10 marvel-fov-8 jtop[472547]: [WARNING] jtop.core.gpu - No NVIDIA GPU available**
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.processes - Process service started
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.memory - Found EMC!
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.memory - Memory service started
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.engine - Engines found: [APE NVDEC NVENC NVJPG OFA SE VIC]
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "CV0" in thermal_zone2
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "CPU" in thermal_zone0
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "SOC2" in thermal_zone7
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "SOC0" in thermal_zone5
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "CV1" in thermal_zone3
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "GPU" in thermal_zone1
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "tj" in thermal_zone8
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "SOC1" in thermal_zone6
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.temperature - Found thermal "CV2" in thermal_zone4
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.power - Alarms VDD_IN - {'crit_alarm': 0, 'max_alarm': 0}
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.power - Alarms VDD_CPU_GPU_CV - {'crit_alarm': 0, 'max_alarm': 0}
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.power - Alarms VDD_SOC - {'crit_alarm': 0, 'max_alarm': 0}
May 22 14:16:10 marvel-fov-8 jtop[472547]: [WARNING] jtop.core.power - Skipped "sum of shunt voltages" /sys/bus/i2c/devices/1-0040/hwmon/hwmon3/in7_label
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.power - Found I2C power monitor
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.fan - Fan pwmfan(1) found in /sys/class/hwmon/hwmon2
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.fan - RPM pwm_tach found in /sys/class/hwmon/hwmon0
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.fan - Found nvfancontrol.service
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.jetson_clocks - jetson_clocks found in /usr/bin/jetson_clocks
May 22 14:16:10 marvel-fov-8 jtop[472547]: [INFO] jtop.core.nvpmodel - nvpmodel running in [0]15W - Default: 0
May 22 14:16:10 marvel-fov-8 jtop[472573]: [INFO] jtop.service - Initialization service
May 22 14:16:11 marvel-fov-8 jtop[472573]: Process JtopServer-1:
May 22 14:16:11 marvel-fov-8 jtop[472573]: Traceback (most recent call last):
May 22 14:16:11 marvel-fov-8 jtop[472573]:   File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
May 22 14:16:11 marvel-fov-8 jtop[472573]:     self.run()
May 22 14:16:11 marvel-fov-8 jtop[472573]:   File "/usr/local/lib/python3.8/dist-packages/jtop/service.py", line 319, in run
May 22 14:16:11 marvel-fov-8 jtop[472573]:     self.jetson_clocks.initialization(self.nvpmodel, data)
May 22 14:16:11 marvel-fov-8 jtop[472573]:   File "/usr/local/lib/python3.8/dist-packages/jtop/core/jetson_clocks.py", line 370, in initialization
May 22 14:16:11 marvel-fov-8 jtop[472573]:     self._engines_list = self.show()
May 22 14:16:11 marvel-fov-8 jtop[472573]:   File "/usr/local/lib/python3.8/dist-packages/jtop/core/jetson_clocks.py", line 522, in show
May 22 14:16:11 marvel-fov-8 jtop[472573]:     lines = cmd(timeout=COMMAND_TIMEOUT)
May 22 14:16:11 marvel-fov-8 jtop[472573]:   File "/usr/local/lib/python3.8/dist-packages/jtop/core/command.py", line 115, in __call__
May 22 14:16:11 marvel-fov-8 jtop[472573]:     raise Command.CommandException('Error process:', self.process.returncode)
May 22 14:16:11 marvel-fov-8 jtop[472573]: jtop.core.command.Command.CommandException: [errno:1] Error process:
May 22 14:16:11 marvel-fov-8 jtop[472547]: [INFO] jtop.service - Service closed
May 22 14:16:11 marvel-fov-8 systemd[1]: jtop.service: Succeeded.
The most notable line above is:
May 22 14:16:10 marvel-fov-8 jtop[472547]: [WARNING] jtop.core.gpu - No NVIDIA GPU available
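The traceback also shows the jtop service dying inside the `show` method of `jetson_clocks.py` because a command returned exit code 1. Running that command by hand may surface the real error message that jtop swallows. The sketch below assumes the command is `jetson_clocks --show`: the binary path comes from the log above, but the `--show` argument is an assumption about what `show()` runs, and `run_show` is a hypothetical helper name.

```python
import subprocess


def run_show(cmd=("/usr/bin/jetson_clocks", "--show"), timeout=10.0):
    """Run a command and return (returncode, stdout, stderr).

    returncode is None when the command could not be started or timed out.
    """
    try:
        proc = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:
        # Covers a missing binary or insufficient permissions.
        return None, "", str(exc)
    except subprocess.TimeoutExpired:
        return None, "", "timed out"
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    code, out, err = run_show()
    print("return code:", code)
    print("stdout:", out)
    print("stderr:", err)
```

On the device this would likely need to be run with sudo, since the jtop service itself runs as root.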
This issue started after the Jetson Orin Nano suddenly rebooted while running an intensive Python script. Our other Orin Nano devices have also rebooted unexpectedly, but none of them ended up with an undetected GPU.
Here is the device info: