Jetson AGX Orin getting stuck often

The machine freezes for ~30 seconds at a time while I’m using it.
I see this in dmesg during the event:

[ 390.632811] NVRM rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff:
[ 391.131464] NVRM rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff:
[57557.296941] NVRM rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff:
[57557.794675] NVRM rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff:
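
To see when these messages occur relative to the freezes, the dmesg output can be filtered and grouped by timestamp. The following is a minimal sketch, not an NVIDIA tool: the function names, the 5-second grouping window, and the script name in the usage note are all assumptions made for illustration. It only recognizes lines in the format shown above.

```python
import re
import sys

# Matches dmesg lines of the form shown above, e.g.
# [57557.296941] NVRM rpcRmApiControl_dce: ... cmd:0x731341 result 0xffff:
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM rpcRmApiControl_dce:.*"
    r"cmd:(?P<cmd>0x[0-9a-fA-F]+)\s+result\s+(?P<result>0x[0-9a-fA-F]+)"
)


def parse_nvrm_failures(lines):
    """Return (seconds_since_boot, cmd, result) for each matching dmesg line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            events.append((float(m.group("ts")), m.group("cmd"), m.group("result")))
    return events


def group_bursts(events, max_gap=5.0):
    """Group events that follow the previous one within max_gap seconds.

    max_gap is an arbitrary illustrative threshold, not a documented value.
    """
    bursts = []
    for ev in events:
        if bursts and ev[0] - bursts[-1][-1][0] <= max_gap:
            bursts[-1].append(ev)
        else:
            bursts.append([ev])
    return bursts


if __name__ == "__main__":
    # Reads dmesg output from standard input.
    for burst in group_bursts(parse_nvrm_failures(sys.stdin)):
        print(f"{burst[0][0]:.3f}s .. {burst[-1][0]:.3f}s: {len(burst)} failure(s)")
```

Saved under a hypothetical name such as nvrm_bursts.py, it could be run as `sudo dmesg | python3 nvrm_bursts.py`. The timestamps are seconds since boot, so they still need to be matched against the wall-clock time of each freeze.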

Jetpack version:

sudo apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Version: 5.1.1-b56
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 5.1.1-b56), nvidia-jetpack-dev (= 5.1.1-b56)
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_5.1.1-b56_arm64.deb
Size: 29304
SHA256: 7b6c8c6cb16028dcd141144b6b0bbaa762616d0a47aafa3c3b720cb02b2c8430
SHA1: 387e4e47133c4235666176032af0f2ec86461dbb
MD5sum: 0a8692031bf35cc46f7a498e2937bda9
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Also note that the dmesg messages may not be related to the freezes. I don’t see this message every time the machine gets stuck.

Note that everything about the machine stops during these few seconds. Even pinging from a separate PC stops during this time.
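
Since ping replies stop during a freeze, pinging from a second machine is one way to record when each freeze starts and ends. Below is a rough sketch under several assumptions: it runs on a Linux host with the iputils `ping` command, the IP address is a placeholder, and the 5-second threshold and all function names are made up for illustration.

```python
import subprocess
import time


def find_gaps(reply_times, threshold=5.0):
    """Return (last_reply, next_reply) pairs more than threshold seconds apart."""
    gaps = []
    for prev, cur in zip(reply_times, reply_times[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps


def ping_once(host):
    """Send one echo request with a 1-second timeout (Linux iputils flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(host, duration_s, interval_s=1.0):
    """Ping host repeatedly and return the wall-clock times of successful replies."""
    reply_times = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        if ping_once(host):
            reply_times.append(time.time())
        time.sleep(interval_s)
    return reply_times


if __name__ == "__main__":
    # 192.168.1.50 is a placeholder; substitute the Jetson's real address.
    times = monitor("192.168.1.50", duration_s=3600)
    for start, end in find_gaps(times):
        print(f"no reply from {time.ctime(start)} to {time.ctime(end)} ({end - start:.1f}s)")
```

The recorded gaps give wall-clock times that can then be compared against timestamps in other logs.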

I probably won’t be able to answer, but it is important to know whether this is the AGX Orin developer’s kit from NVIDIA, or a module on a third party carrier board. The carrier board changes device tree requirements, and odd behavior often comes from using the wrong device tree. You’d also want to include which L4T release you have (you can use “head -n 1 /etc/nv_tegra_release”). If the release is too old (and Orin is new enough that most previous releases are too old even if relatively new), it is possible that the only suggestion will be to flash the newer release and see if the problem still occurs.

Note that often serial console will still have logging output even when the rest of the machine is failing. Serial console has very few driver requirements, and so it might still output an error message when network and other parts are failing. The more interesting part is that if serial console does not output logging during that time, then this too is an important clue since a failing serial console requires a more severe error. If you could get a serial console full boot log I’m sure someone will find that useful. If you could get a serial console log before and during and after the error, then that would be a gold mine of information, so you might try to get a serial console log all the way from power on to past that error.

You could try to dump the serial console log when the freeze happens.

The UART serial console can still dump logs when the system goes down; other methods may not.

@WayneWWW I connected a different PC to the AGX Orin, but I see no output on any of the 4 ports that are detected. I set the baud rate to 115200, without any luck. Is there a setting on the Jetson to enable the serial console?
@linuxdev I’m using the NVIDIA devkit. The output of “head -n 1 /etc/nv_tegra_release” is

# R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 1 19:57:35 UTC 2023
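
For reference, the release line above can be turned into the usual dotted L4T version string (R35 with REVISION 4.1 reads as 35.4.1). This is a small illustrative sketch; the function name and regular expression are assumptions and only cover the line format shown in this thread.

```python
import re

# Matches the start of the first line of /etc/nv_tegra_release, e.g.
# "# R35 (release), REVISION: 4.1, GCID: ..."
RELEASE_RE = re.compile(
    r"#\s*R(?P<major>\d+)\s*\(release\),\s*REVISION:\s*(?P<rev>[\d.]+)"
)


def l4t_version(first_line):
    """Return the dotted L4T version, e.g. '35.4.1', parsed from the release line."""
    m = RELEASE_RE.search(first_line)
    if m is None:
        raise ValueError(f"unrecognized release line: {first_line!r}")
    return f"{m.group('major')}.{m.group('rev')}"


if __name__ == "__main__":
    # Only meaningful when run on the Jetson itself.
    with open("/etc/nv_tegra_release", encoding="utf-8") as f:
        print(l4t_version(f.readline()))
```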

If you are sure your board is booting into the system but you don’t see any log, then it must be a setup problem or some other issue.

Please be aware that you need to use the micro-USB port on the devkit, not the Type-C port.

If you still see nothing, please share your command and setup.

Yes. I’m using the devkit’s micro-USB port.
Which command do you need?

What command did you use on your host PC to open the console?

I used PuTTY on Windows and also the Arduino IDE serial monitor. I checked all 4 virtual COM ports that the USB connection exposes.

At what point did you start PuTTY? What state is your Jetson in at the moment?

115200 baud.
The Jetson is booted up. I also tried keeping the serial monitor open while booting the Jetson.

By timing I mean: did you open the console before the system booted, or after it had already been up for a while?

Please try using an Ubuntu host to dump the serial console. We seldom use PuTTY on Windows, so I’m not sure whether it works.
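
As one possible way to capture the console on an Ubuntu host, here is a sketch of a logger that prefixes each line with the host’s wall-clock time, which makes it easier to line up console output with a freeze. It is not an NVIDIA-provided tool. The device path /dev/ttyACM0 and the log file name are assumptions (check which serial devices appear when the micro-USB cable is plugged in), and 115200 baud is the rate mentioned in this thread. Opening the device may require sudo or membership in the dialout group.

```python
import os
import sys
import termios
import time


def open_serial(path, baud=termios.B115200):
    """Open a tty in raw 8N1 mode at the given baud constant; return the fd."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    attrs = termios.tcgetattr(fd)
    # attrs layout: [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[0] = 0  # no input translation
    attrs[1] = 0  # no output processing
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL
    attrs[3] = 0  # no echo, no canonical mode, no signals
    attrs[4] = baud
    attrs[5] = baud
    attrs[6][termios.VMIN] = 1   # block until at least one byte arrives
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def stamp_lines(chunk, pending, now):
    """Split bytes into complete lines prefixed with a host timestamp.

    Returns (stamped_lines, leftover_bytes); leftover is an incomplete last line.
    """
    *complete, leftover = (pending + chunk).split(b"\n")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    lines = []
    for raw in complete:
        text = raw.decode("utf-8", errors="replace").rstrip("\r")
        lines.append(f"[{stamp}] {text}")
    return lines, leftover


def main(device="/dev/ttyACM0", log_path="serial_console.log"):
    # Both defaults are assumptions; pass real values on the command line.
    fd = open_serial(device)
    pending = b""
    with open(log_path, "a", encoding="utf-8") as log:
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            lines, pending = stamp_lines(chunk, pending, time.time())
            for line in lines:
                log.write(line + "\n")
            log.flush()  # keep the file current in case the host is interrupted


if __name__ == "__main__":
    main(*sys.argv[1:3])
```

Starting the logger before powering on the Jetson and leaving it running past a freeze would give the full boot-to-error log asked for earlier in the thread.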

It did work when connecting from an Ubuntu host!

FYI: I also upgraded JetPack to the latest version. Not sure if that’s the reason, but I haven’t seen the issue since.

Dumping the serial console log does not depend on the JetPack version.
