Orin fails with message nvgpu: 17000000.ga10b ga10b_pbdma_handle_intr_0_legacy:437 [ERR] semaphore acquire timeout!

jhnlmn · June 27, 2023, 12:04am

Hi,
This error is rare: I have seen it only a few times over several months of testing. It happened twice in the last 2 days, both times at night.
It happened with OS 35.3.1 and earlier.
In the few cases where I managed to obtain a syslog, I found a correlation with the message

systemd-timesyncd[376]: Initial synchronization to time server 104.171.113.34:123 (1.pool.ntp.org).
which is followed by
kernel: [   46.450236] nvgpu: 17000000.ga10b  ga10b_pbdma_handle_intr_0_legacy:437  [ERR]  semaphore acquire timeout!
kernel: [   46.460468] __ga10b__ Channel Status - chip ga10b
kernel: [   46.460470] __ga10b__ ---------------------------
kernel: [   46.465329] __ga10b__ 420-ga10b, TSG: 26, pid 2879, refs: 2, deterministic: no, domain name: (default)
kernel: [   46.470176] __ga10b__ channel status:  in use idle not busy
After that, __ga10b__ errors are printed non-stop and the Orin requires a reboot.
However, the message "Initial synchronization to time server" appears in good boots as well; in the bad cases it always precedes the __ga10b__ error.
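A minimal sketch of how this correlation could be checked across a collected syslog, in Python. It is only an illustration under assumptions: the function name `classify_boots` is hypothetical, a new boot is assumed to start at the first kernel line stamped `[    0.000000]` after any non-zero line, and the patterns are taken from the two messages quoted above.

```python
import re

# Patterns based on the two log messages quoted in this post.
TIMESYNC = re.compile(r"systemd-timesyncd\[\d+\]: Initial synchronization to time server")
GPU_ERR = re.compile(r"nvgpu: .*semaphore acquire timeout")
# Assumption: each boot begins with a run of kernel lines stamped 0.000000.
BOOT_START = re.compile(r"kernel: \[\s*0\.000000\]")


def classify_boots(lines):
    """Split syslog lines into boots and record, for each boot, whether the
    time sync message was seen, whether the GPU error was seen, and whether
    the time sync message came before the first GPU error."""
    boots = []
    prev_was_start = False
    for line in lines:
        is_start = bool(BOOT_START.search(line))
        # Open a new boot record on the first 0.000000 line of a run,
        # or on the very first line if the log starts mid-boot.
        if (is_start and not prev_was_start) or not boots:
            boots.append({"timesync": False, "gpu_error": False,
                          "sync_before_error": False})
        prev_was_start = is_start
        boot = boots[-1]
        if TIMESYNC.search(line):
            boot["timesync"] = True
        elif GPU_ERR.search(line) and not boot["gpu_error"]:
            boot["gpu_error"] = True
            boot["sync_before_error"] = boot["timesync"]
    return boots
```

It can be fed the lines of a syslog file, for example `classify_boots(open(path, errors="replace"))`, and the resulting list then shows how many boots had the time sync message with and without the GPU error following it.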

What could it be? Is it possible that “semaphore acquire timeout” is somehow caused by a system time change?
I see that other people reported this error after a system resume, but I never performed any suspend or resume, only a reboot using the “reboot” command.

Thank you

WayneWWW · June 27, 2023, 3:01am

Hi,

We may need the full error log. Next time, you could capture it over the serial console on the micro-USB port: NVIDIA Jetson Orin - Serial Console - RidgeRun Developer Connection

Also, please see whether you can find a way to reproduce the issue.

jhnlmn · June 27, 2023, 7:03am

Reproducing it again with a serial console connected will take time.
Meanwhile, I have attached the syslog from the last event.
orin2_syslog_semaphore_acquire_timeout_2023_06_16_to_send.txt (7.2 MB)

kayccc · July 25, 2023, 7:14am

Can you provide a serial log and the steps to reproduce the issue?

jhnlmn · July 25, 2023, 6:21pm

I gave you a syslog with the error, which appears to contain all the same information as a serial log (which I do not have yet).
Maybe you can point me to the place in the kernel sources where this error comes from, so I can add more logging there?
