[ 35.105592] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l

We ran a reboot stress test on a Jetson-Orin-AGX board mounted on our own baseboard. After 681 rounds, the system lagged on the setup page and the following log appeared on the TCU serial terminal.

[2025-06-07 21:22:03] ubuntu login: [ 35.105581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:23:06] [ 35.105592] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
[2025-06-07 21:23:07] [ 98.125581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:24:09] [ 98.125593] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
[2025-06-07 21:24:09] [ 161.145581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:25:12] [ 161.145593] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
[2025-06-07 21:25:12] [ 224.165580] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:26:16] [ 224.165592] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
[2025-06-07 21:26:16] [ 287.185580] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:27:19] [ 287.185591] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
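
For anyone triaging a similar capture, here is a minimal Python sketch that pulls the kernel timestamps of the stall reports and the blocked task IDs out of a serial log like the one above. The function names are made up for illustration, and it assumes the `P451/...` token denotes the task with PID 451, which is how the kernel's RCU stall-warning documentation describes the `P<pid>` entries. The sample text is the first four lines of the log above.

```python
import re

# Kernel timestamp followed by the rcu_preempt stall header, as in the log above.
STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+rcu: INFO: rcu_preempt detected stalls")
# "P451/2:b..l" style token on the "Tasks blocked" line (assumed: P<pid>/...).
BLOCKED_RE = re.compile(r"\bP(\d+)/")


def stall_times(log_text):
    """Return the kernel timestamps (seconds since boot) of each stall report."""
    return [float(m.group(1)) for m in STALL_RE.finditer(log_text)]


def stall_intervals(times):
    """Return the gaps between consecutive stall reports, rounded to 10 ms."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


def blocked_pids(log_text):
    """Return the set of task IDs listed as blocking the grace period."""
    return {int(pid) for pid in BLOCKED_RE.findall(log_text)}


sample = """\
[2025-06-07 21:22:03] ubuntu login: [ 35.105581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:23:06] [ 35.105592] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
[2025-06-07 21:23:07] [ 98.125581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2025-06-07 21:24:09] [ 98.125593] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P451/2:b..l
"""

times = stall_times(sample)
print(times, stall_intervals(times), blocked_pids(sample))
```

Run against the full log, this shows the report repeating at a fixed interval of about 63 seconds and always naming the same task, so the stall appears to come from one task that never leaves its RCU read-side critical section rather than from several unrelated ones.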

I have also attached the logs from the last two restarts.
dump_tess.log (637.3 KB)

Could you please help us analyze the cause? Are there any specific directions we should investigate?

Can this be reproduced on rel-36.4.3 without the RT patch?

We have not run the test in an environment without the RT patch.
Our product requires a real-time environment, and it took 681 rounds of stress testing to reproduce the issue once, so we are not sure whether stress testing would reproduce it without the RT patch.
Is it possible to say, in theory, under what circumstances this problem occurs?

We are checking this issue now.

Hello, has there been any progress on this issue?

Please apply this patch to your kernel.

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index cd50436..ddf88c3 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -1938,9 +1938,13 @@
 	 * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
 	 * be used for atomic transfers.
 	 */
-	if (!i2c_dev->is_vi)
+	if (!i2c_dev->is_vi && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
 		pm_runtime_irq_safe(i2c_dev->dev);
-
+		dev_info(i2c_dev->dev, "I2C: pm_runtime_irq_safe set (non-RT kernel)\n");
+	}
+	else {
+		dev_info(i2c_dev->dev, "I2C: pm_runtime_irq_safe NOT set (RT kernel or VI I2C)\n");
+	}
 	pm_runtime_enable(i2c_dev->dev);
 
 	err = tegra_i2c_init_hardware(i2c_dev);

We have merged it and are verifying whether the issue has been resolved.
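
Before restarting a long stress run, it may be worth confirming that the patched driver is the one actually booted. The patch adds two `dev_info` messages, so a rough Python sketch like the one below can count them in captured `dmesg` output. The function name and the device prefix in the sample lines are hypothetical; only the message text is taken from the patch.

```python
# Message strings copied from the dev_info() calls added by the patch.
IRQ_SAFE_SET = "I2C: pm_runtime_irq_safe set (non-RT kernel)"
IRQ_SAFE_NOT_SET = "I2C: pm_runtime_irq_safe NOT set (RT kernel or VI I2C)"


def count_patch_messages(dmesg_text):
    """Count how many I2C controllers logged each of the two patch messages."""
    lines = dmesg_text.splitlines()
    return {
        "set": sum(IRQ_SAFE_SET in line for line in lines),
        "not_set": sum(IRQ_SAFE_NOT_SET in line for line in lines),
    }


# Hypothetical dmesg excerpt: the "example-i2c N:" prefix and timestamps are
# placeholders, not real output from a Jetson board.
sample_dmesg = """\
[    1.000000] example-i2c 0: I2C: pm_runtime_irq_safe NOT set (RT kernel or VI I2C)
[    1.000100] example-i2c 1: I2C: pm_runtime_irq_safe NOT set (RT kernel or VI I2C)
[    1.000200] unrelated driver message
"""

print(count_patch_messages(sample_dmesg))
```

On a kernel built with `CONFIG_PREEMPT_RT` and this patch, the "set" count should be zero and the "not_set" count should be at least one. If neither message appears, the board is probably still running the unpatched driver. This only confirms that the patch is active; whether the stall is gone still has to come from the stress test itself.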
