The kernel panic occurs during suspend/resume: “pc : nvInitFlipEvoHwState+0x30/0xf8 [nvidia_modeset]”.
wake-up by LTE: Suspend_Panic_lte.txt (70.8 KB)
wake-up by USB: Suspend_Panic_mouse.txt (91.2 KB)
1. We are using a custom board with l4t-r35.4.1.
This issue only happens with an HDMI display; it cannot be reproduced on the Devkit (DP).
2. We have a panic log similar to the one discussed in this topic:
3. I found an easy way to reproduce this panic:
a. sudo systemctl suspend
b. wait for 3-4s
c. wake-up system
I wrote a script to reproduce this issue.
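#!/bin/bash
# Seconds between arming the RTC alarm and the wake-up.
test_period=4
for i in {1..500}
do
    echo "Iteration: $i"
    # Arm the RTC alarm without suspending (-m no), then suspend.
    sudo /usr/sbin/rtcwake -m no -s "$test_period"
    sudo systemctl suspend
    sleep "$test_period"
done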
With “test_period=30” and 1000 iterations, the system did not panic.
With “test_period=4”, the panic rate was 5/25.
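To make a rate such as 5/25 easier to collect across reboots, one option is to append each iteration number to a file and flush it to disk before suspending, so the last recorded number shows where a run stopped after a panic. This is only a minimal sketch: the helper names `log_iteration` and `panic_rate` and the log format are my own assumptions and are not part of the test script above.

```shell
#!/bin/bash
# Hypothetical helpers; names and log format are illustrative assumptions.

# Append the iteration number and flush it to disk so the entry
# survives a panic during the following suspend/resume cycle.
log_iteration() {
    local log_file="$1" iteration="$2"
    echo "$iteration" >> "$log_file"
    sync
}

# Print the panic rate as a percentage with one decimal place.
# Usage: panic_rate PANICS ITERATIONS
panic_rate() {
    local panics="$1" iterations="$2"
    if [ "$iterations" -le 0 ]; then
        echo "iterations must be > 0" >&2
        return 1
    fi
    awk -v p="$panics" -v n="$iterations" 'BEGIN { printf "%.1f%%\n", 100 * p / n }'
}
```

For the numbers above, `panic_rate 5 25` prints `20.0%`.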
I suspect the panic is caused by a wake-up that arrives at a particular point (3-4 s after suspend starts) while the suspend process is still running.
The issue occurs whether the wake-up comes from LTE, USB, or RTC.
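To check which interrupt actually woke the system in a given cycle, the kernel may expose the first wake-up IRQ of the last suspend in `/sys/power/pm_wakeup_irq` (this requires a kernel built with `CONFIG_PM_SLEEP_DEBUG`; I have not confirmed that the file is present on l4t-r35.4.1). A rough sketch, where the helper name `show_wakeup_irq` is hypothetical and both paths can be overridden:

```shell
#!/bin/bash
# Hypothetical helper: show_wakeup_irq [PM_WAKEUP_IRQ_FILE] [INTERRUPTS_FILE]
# Prints the IRQ recorded for the last wake-up and its /proc/interrupts line.
show_wakeup_irq() {
    local irq_file="${1:-/sys/power/pm_wakeup_irq}"
    local interrupts="${2:-/proc/interrupts}"
    local irq
    # The read fails if the file is absent or no wake-up IRQ was recorded.
    if ! irq="$(cat "$irq_file" 2>/dev/null)" || [ -z "$irq" ]; then
        echo "no wakeup IRQ recorded"
        return 1
    fi
    echo "wakeup IRQ: $irq"
    # Lines in /proc/interrupts start with optional spaces, the IRQ number, then a colon.
    grep -E "^ *${irq}:" "$interrupts"
}
```

Calling `show_wakeup_irq` right after each resume in the test loop would show whether the failing cycles share the same wake-up interrupt.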
I modified the test script, but the panic still occurs.
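#!/bin/bash
test_period=4
for i in {1..500}
do
    echo "Iteration: $i"
    sudo /usr/sbin/rtcwake -m no -s "$test_period"
    # Write 0x0D to register 0x2212000 before suspend.
    sudo busybox devmem 0x2212000 w 0x0D
    sudo systemctl suspend
    sleep "$test_period"
    # Write 0x4D to the same register after resume.
    sudo busybox devmem 0x2212000 w 0x4D
done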
Regarding that post, the panic there occurs at: “pc : tegra186_gpio_irq+0x1a0/0x1e0”
But in our case, the panic occurs at: “pc : nvInitFlipEvoHwState+0x30/0xf8 [nvidia_modeset]”
I don’t think these two issues have the same root cause.
I noticed that the NVIDIA driver logs some errors before the panic, so I printed the return value of “nvRmApiControl()”.
When the panic occurs, the value is 26. What does it mean?
The issue cannot be reproduced in version R36, but that does not seem to be related to the NVIDIA display driver.
Previous experiments indicate that this issue only occurs when the suspend process is interrupted at a specific timing.
In version R36, the system cannot be awakened using USB, RTC, power key, or other methods before the suspend process is complete.
What I mean is that the reason R36 does not encounter the nvidia_modeset panic is not because of any changes made to the Nvidia display driver. Instead, during my testing on version R36, I performed the following steps:
sudo systemctl suspend
Wait for 4 seconds
Press the power key to wake up the system
Result: The system didn’t wake up.
From this test result, it appears that in R36, the system cannot be awakened using the power key before the suspend process is complete.
Therefore, the “nvidia_modeset panic that occurs when waking up the system after 4 seconds of suspend” cannot be reproduced in R36.
We only have environments for R35 and R36.2, and we have committed to R35.4.1 for developing our product.
Creating a new environment for R36.3 and porting all our modifications to it would take a lot of time, so it is not really feasible.
I want to confirm one point: the NVIDIA display driver for R35 will no longer be updated, yet this version of the driver carries a risk of panic.
So, from NVIDIA’s perspective, is suspend not supported on R35 with HDMI?
Actually, you shouldn’t use a developer preview for any error checking for now, which means your 36.2 result does not really matter to the current situation.
It is just that rel-35 is already in maintenance mode. New errors found on it may not get fixed. It is better to upgrade to rel-36. We may only check new issues on rel-36.
I checked the release notes of R36.2 and R36.3.
4185596
Both R36.2 and R36.3
Jetson AGX Orin Developer Kit and Jetson AGX Industrial modules could
intermittently fail to resume after suspend.
R36.3
Waking up from Deep Sleep state (SC7) by USB events is not supported in the
NVIDIA JetPack 6.0 GA release for the Jetson Orin Series of products. This
functionality will be added in a future release.
It seems that “disabling USB wake” is the workaround for “resume failed”.
Would you mind telling me how to disable the USB wake feature in r35.4.1?
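Issue 4185596 first appears in R36.2: fail to resume after suspend.
In R36.3, the same 4185596 entry drops support for USB wake-up.
Because both fall under the same 4185596 entry, I infer that “dropping support for USB wake-up” is the workaround for “fail to resume after suspend”.
So I would like to know how R36.3 “drops support for USB wake-up”, so that I can apply it to r35.4.1 for testing.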
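For reference, the generic Linux way to turn off wake-up for individual USB devices is the per-device `power/wakeup` attribute in sysfs. I do not know whether this is what R36.3 does internally, and on Tegra the wake event may be routed through other controllers as well, so the following is only a sketch under that assumption. The helper name `disable_usb_wakeup` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: disable_usb_wakeup [SYSFS_USB_DIR]
# Writes "disabled" to every USB device whose power/wakeup is "enabled".
# Assumption: uses the generic sysfs wakeup attribute, not an L4T-specific switch.
disable_usb_wakeup() {
    local root="${1:-/sys/bus/usb/devices}"
    local f
    for f in "$root"/*/power/wakeup; do
        # Skip entries that do not exist or are not writable (run as root on a real system).
        [ -w "$f" ] || continue
        if [ "$(cat "$f")" = "enabled" ]; then
            echo disabled > "$f"
            echo "disabled wakeup: $f"
        fi
    done
}
```

Running `disable_usb_wakeup` as root before the suspend loop, and then repeating the “test_period=4” test, would show whether the panic still occurs when only LTE or RTC can wake the system.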