Bug in printk.c code when rt-patches/* are applied

Hello,

As part of our testing of various RedHawk kernel flavors, we see that the debug kernel produces below backtrace earlier in the boot sequence.

[    0.210261] DEBUG_LOCKS_WARN_ON(!p->hardirqs_enabled)
[    0.210297] ------------[ cut here ]------------
[    0.210337] WARNING: CPU: 0 PID: 0 at kernel/fork.c:1627 copy_process.isra.5.part.6+0x704/0x1940
[    0.210389] Modules linked in:
[    0.210412] 
[    0.210428] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.201-rt134-r32.5.1-tegra-RedHawk-7.5.4-rt-debug #21
[    0.210485] Hardware name: quill (DT)
[    0.210509] task: ffffff800a154f00 task.stack: ffffff800a140000
[    0.210546] PC is at copy_process.isra.5.part.6+0x704/0x1940
[    0.210580] LR is at copy_process.isra.5.part.6+0x704/0x1940
[    0.210614] pc : [<ffffff80080ab0f4>] lr : [<ffffff80080ab0f4>] pstate: 60400045
[    0.210657] sp : ffffff800a143d70
[    0.210679] x29: ffffff800a143d70 x28: 0000000000800300 
[    0.210716] x27: ffffff800b7f5000 x26: 0000000000000000 
[    0.210754] x25: ffffff80090e6110 x24: 0000000000000000 
[    0.210790] x23: 0000000000000000 x22: ffffffc1ec698000 
[    0.210826] x21: 0000000000000000 x20: ffffff80080ac4cc 
[    0.210862] x19: ffffff800a149000 x18: ffffffffffffffff 
[    0.210898] x17: 0000000000000010 x16: 0000000000000000 
[    0.210933] x15: ffffff800a149fd0 x14: ffffff808b7f5b9f 
[    0.210974] x13: ffffff800b7f5bad x12: 0000000000000000 
[    0.211012] x11: 0000000005f5e0ff x10: 0000000000000069 
[    0.211051] x9 : 00000000ffffffd0 x8 : 64656c62616e655f 
[    0.211090] x7 : 000000000000006d x6 : ffffff80081325a4 
[    0.211129] x5 : 0000000000000000 x4 : 0000000000000000 
[    0.211168] x3 : 0000000000000000 x2 : ffffffffffffffff 
[    0.211206] x1 : ffffff800a14a0b0 x0 : 0000000000000029 
[    0.211245] 
[    0.211263] ---[ end trace 7c274a1c59fae664 ]---
[    0.211295] Call trace:
[    0.211318] [<ffffff80080ab0f4>] copy_process.isra.5.part.6+0x704/0x1940
[    0.211362] [<ffffff80080ac4cc>] _do_fork+0xd4/0x5c0
[    0.211397] [<ffffff80080aca60>] kernel_thread+0x48/0x90
[    0.211436] [<ffffff80090e60ac>] rest_init+0x2c/0x90
[    0.211474] [<ffffff8009870c40>] start_kernel+0x40c/0x424
[    0.211510] [<ffffff8009870204>] __primary_switched+0x80/0x94

This is occurring because there’s a missing lockdep_on() call in kernel/printk/printk.c file after the application of the rt-patches/* . Possible fix to the rt-patches/0246-printk-Make-rt-aware.patch is as below:

Index: a/kernel-4.9/kernel/printk/printk.c
===================================================================
--- a.orig/kernel-4.9/kernel/printk/printk.c
+++ a/kernel-4.9/kernel/printk/printk.c
@@ -1939,6 +1939,7 @@ asmlinkage int vprintk_emit(int facility
 		 */
 		if (may_trylock && console_trylock())
 			console_unlock();
+		lockdep_on();
 	}
 
 	return printed_len;

Please note that the above backtrace happens on non-PRT kernels as well if the rt-patches/* are applied but CONFIG_PREEMPT_RT_FULL is not set. Also, this is happening on other jetson boards as well and not just TX2. I found it on a TX2 first hence reporting here.

Thanks for your report will check it.

Thankyou for reporting. Please remove the misplaced lockdep_off() call instead.
File: rt-patches/0245-printk-Make-rt-aware.patch
if (!in_sched) {

  •          int may_trylock = 1;
    

-+ lockdep_off();

1 Like

Okay. Thank you for the comment.