KevinFFF,
Thank you for the information.
Have you also tried using Jetson Power GUI to check the GPU status?
No, I have not tried using the Jetson Power GUI tool. I am running my AGX Orin 32GB dev kit in headless mode. I tried connecting to the device with SSH X-forwarding enabled, and then running the /usr/share/nvpmodel_indicator/nvpmodel_indicator.py command. I get a segmentation fault:
(nvpmodel_indicator.py:3472): Gtk-CRITICAL **: 10:17:13.786: gtk_icon_theme_get_for_screen: assertion 'GDK_IS_SCREEN (screen)' failed
/usr/share/nvpmodel_indicator/./nvpmodel_indicator.py:237: Warning: invalid (NULL) pointer instance
indicator = appindicator.Indicator.new(INDICATOR_ID, ICON_DEFAULT,
/usr/share/nvpmodel_indicator/./nvpmodel_indicator.py:237: Warning: g_signal_connect_data: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
indicator = appindicator.Indicator.new(INDICATOR_ID, ICON_DEFAULT,
(nvpmodel_indicator.py:3472): Gtk-CRITICAL **: 10:17:13.791: _gtk_style_provider_private_get_settings: assertion 'GTK_IS_STYLE_PROVIDER_PRIVATE (provider)' failed
(nvpmodel_indicator.py:3472): Gtk-CRITICAL **: 10:17:13.791: _gtk_style_provider_private_get_settings: assertion 'GTK_IS_STYLE_PROVIDER_PRIVATE (provider)' failed
(nvpmodel_indicator.py:3472): Gtk-CRITICAL **: 10:17:13.791: _gtk_style_provider_private_get_settings: assertion 'GTK_IS_STYLE_PROVIDER_PRIVATE (provider)' failed
[1] 3472 segmentation fault (core dumped) python3 ./nvpmodel_indicator.py
Next, I tried connecting a monitor, mouse, and keyboard to my AGX Orin 32GB dev kit, but I get an X server error and cannot launch the GNOME desktop/GUI. Only the console appears.
Trying to use the Jetson Power GUI tool is probably off topic.
You can also refer to Activating GPU Power Rails on AGX Orin without a GUI - #5 by KevinFFF for the similar topic as yours.
Interesting! Thank you for sharing this topic. Looking at that topic, this comment offers the following output from tegrastats for L4T v35 as part of JetPack v5:
05-10-2024 11:54:03 RAM 3274/54718MB (lfb 10932x4MB) SWAP 0/27359MB (cached 0MB) CPU [2%@729,0%@729,0%@729,6%@729,0%@729,0%@729,0%@729,0%@729,0%@1497,0%@1497,0%@1497,13%@1497] EMC_FREQ 0%@2133 GR3D_FREQ 0%@[0,0] VIC_FREQ 921 APE 174 CV0@-256C CPU@52.656C Tboard@42C SOC2@49.125C Tdiode@42.5C SOC0@50.687C CV1@-256C GPU@-256C tj@52.562C SOC1@50.312C CV2@-256C VDD_GPU_SOC 2154mW/2154mW VDD_CPU_CV 718mW/718mW VIN_SYS_5V0 7862mW/7862mW NC 0mW/0mW VDDQ_VDD2_1V8AO 796mW/796mW NC 0mW/0mW
The GR3D_FREQ is 0%@[0,0], but the GPU@-256C temperature is present. The -256C indicates the temperature reading is not valid, but the GPU@ “tag” is nonetheless still present in the tegrastats output. It is possible the GPU temperature is reporting -256C for the user in that topic because it is a custom carrier board.
I started jtop and ran an application that used the GPU. I saw the GPU usage reach 93% in jtop, and the GPU temperature changed from Offline to 43.44C. Then, I stopped the application that used the GPU. The GPU usage returned to 0.0%, yet the GPU temperature did not change back to Offline. It continued to report a valid GPU temperature.
I exited jtop and ran sudo tegrastats from the command line:
10-16-2024 10:35:20 RAM 5764/30697MB (lfb 6x4MB) SWAP 0/15348MB (cached 0MB) CPU [5%@729,6%@729,5%@729,7%@729,6%@729,5%@729,3%@729,5%@729,off,off,off,off] EMC_FREQ 0%@2133 GR3D_FREQ 0%@[305,0] NVENC off NVDEC off NVJPG off NVJPG1 off VIC off OFA off NVDLA0 off NVDLA1 off PVA0_FREQ off APE 174 cpu@48.187C soc2@43.875C soc0@45.531C gpu@44.25C tj@48.187C soc1@44.75C VDD_GPU_SOC 2785mW/2785mW VDD_CPU_CV 397mW/397mW VIN_SYS_5V0 4145mW/4145mW
The gpu@44.25C “tag” and temperature appear despite the GR3D_FREQ 0% after stopping the GPU application.
Thus, it appears the GPU temperature is offline (absent from the output) until the first use of the GPU. This is a behavior change for tegrastats.
JetPack v5
The GPU temperature is offline until first GPU use, and the GPU@-256C temperature is used to indicate “offline”. I do not have a JetPack v5 device at the moment to confirm the JetPack v5 behavior, but this topic seems to confirm it.
Unconfirmed if the GPU temperature continues to be “online” after first GPU use.
JetPack v6
The GPU temperature is offline until first GPU use, but tegrastats indicates “offline” by not providing any gpu “tagged” temperature at all. After first GPU use, the GPU temperature continues to be “online”: even when the GPU usage returns to 0%, the temperature is still reported.
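For anyone scripting around this difference, here is a minimal sketch of how a tegrastats line could be checked for a usable GPU temperature under both conventions described above. It is based only on the two sample lines in this post; the function name gpu_temperature and the OFFLINE_SENTINEL constant are my own hypothetical names, and treating exactly -256 as “offline” is an assumption drawn from the JetPack v5 sample, not from NVIDIA documentation.

```python
import re
from typing import Optional

# Matches "gpu@44.25C" (JetPack v6 sample) or "GPU@-256C" (JetPack v5 sample).
_GPU_TEMP = re.compile(r"\bgpu@(-?\d+(?:\.\d+)?)C\b", re.IGNORECASE)

# Assumption: -256C is the "offline" placeholder, as seen in the v5 sample.
OFFLINE_SENTINEL = -256.0


def gpu_temperature(line: str) -> Optional[float]:
    """Return the GPU temperature in Celsius, or None if it looks offline.

    "Offline" covers both observed cases: no gpu@ tag at all, or a gpu@ tag
    whose value is the -256C placeholder.
    """
    match = _GPU_TEMP.search(line)
    if match is None:
        return None  # no gpu@ tag in the line
    value = float(match.group(1))
    if value == OFFLINE_SENTINEL:
        return None  # tag present, but carrying the placeholder value
    return value
```

With the fragments from the samples above, gpu_temperature("soc0@45.531C gpu@44.25C tj@48.187C") gives 44.25, while gpu_temperature("CV1@-256C GPU@-256C tj@52.562C") and a line with no gpu tag both give None.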
Does this “first use” requirement have anything to do with the GR3D_FREQ field being GR3D_FREQ 0%@[0,0] before first use and changing to GR3D_FREQ 0%@[305,0] after first use?
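To compare that field before and after first GPU use, a small sketch like the following could pull it out of each tegrastats line. It assumes the field always has the shape seen in the two samples in this post (a load percentage, then @, then a bracketed comma-separated list of integers); the function name gr3d_freq is hypothetical, and I am not making any claim about what the individual bracketed values mean.

```python
import re
from typing import List, Optional, Tuple

# Shape taken from the samples in this post, e.g. "GR3D_FREQ 0%@[305,0]".
_GR3D = re.compile(r"GR3D_FREQ (\d+)%@\[(\d+(?:,\d+)*)\]")


def gr3d_freq(line: str) -> Optional[Tuple[int, List[int]]]:
    """Return (load_percent, bracketed_values), or None if the field is absent."""
    match = _GR3D.search(line)
    if match is None:
        return None
    load = int(match.group(1))
    values = [int(v) for v in match.group(2).split(",")]
    return load, values
```

Logging this value alongside the GPU temperature from a fresh boot through the first GPU workload would show whether the two change at the same moment.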
Thank you