Are priority changes ever disallowed in the Linux kernel (for Jetson)?

I’m sorry if this is the wrong place to ask, but…

I am customising the Linux kernel (for Jetson).

In the stock Linux kernel, the BH (bottom-half) thread priority is set to a constant, MAX_USER_RT_PRIO / 2 [1]. I am changing it from that constant to another value in the interrupt handling path [2]. Specifically, the priority is changed in __irq_wake_thread().

#code1

action->thread->prio = prio_input;

or
#code2

sched_setscheduler_nocheck(action->thread, SCHED_FIFO, &param_input);

But with a kernel customised this way, the hardware crashes and reboots two seconds later.

I debugged it and identified #code1 and #code2 as the cause, but I don’t understand what’s wrong with them.
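
For context on #code1, here is a minimal user-space sketch of how the kernel derives a task's internal prio value from its real-time priority. The constants are assumed to match include/linux/sched/prio.h in 4.9-era kernels, and rt_priority_to_kernel_prio() is a hypothetical helper name used only for illustration, not a kernel function. Because the real-time run queues are indexed by this derived value, assigning ->prio directly, without going through the scheduler's own dequeue/enqueue path, can leave an already-queued task filed under the wrong priority. That may be one reason #code1 is unsafe.

```c
#include <assert.h>

/* Assumed to mirror include/linux/sched/prio.h in 4.9-era kernels. */
#define MAX_USER_RT_PRIO 100
#define MAX_RT_PRIO      MAX_USER_RT_PRIO

/*
 * Hypothetical helper: maps the user-visible real-time priority (1..99,
 * higher means more urgent) to the kernel-internal prio (0..98, lower
 * means more urgent), using the same arithmetic as the scheduler:
 *     prio = MAX_RT_PRIO - 1 - rt_priority
 */
int rt_priority_to_kernel_prio(int rt_priority)
{
    /* Only real-time priorities in the user-visible range make sense here. */
    assert(rt_priority >= 1 && rt_priority <= MAX_USER_RT_PRIO - 1);
    return MAX_RT_PRIO - 1 - rt_priority;
}
```

With these definitions, the default IRQ thread priority MAX_USER_RT_PRIO / 2 (that is, 50) corresponds to an internal prio of 49, so a raw value written into ->prio does not mean the same thing as the sched_priority passed in #code2.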

It simply went down without any error messages or pop-ups, and dmesg has no clues.

Output of the “last” command:

reboot system boot 4.9.201-rt134 Sun Jul 10 11:24 still running 
him :1 :1 Sun Jul 10 11:21 - crash (00:02) 
reboot system boot 4.9.201-rt134 Fri Dec 31 20:00 still running 

Are priority changes ever disallowed in an interrupt handler?

If you know anything at all, please give me a hint.

Environment
Hardware: Jetson nano
Power mode: 10 W
Power: 5V4A
Kernel: linux kernel-4.9
How to build: Applying a PREEMPT-RT patch to JetPack 4.5 on Jetson Nano - #4 by ajcalderont

[1] In setup_irq_thread()
[2] In __irq_wake_thread()

Thank you in advance.

I haven’t worked with the real-time patch, so I can’t answer directly. However, you’ll probably need to show the actual block of code you’ve changed and capture a serial console log (run “dmesg --follow” over a serial console from a remote computer with logging enabled), which would provide more messages than the “last” command shows.

Thanks for your reply.
I tried it.
But I could not find the cause…

[ 1219.445624] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[0000.159] [L4T TegraBoot] (version 00.00.2018.01-l4t-e82258de)
[0000.164] Processing in cold boot mode Bootloader 2
[0000.169] A02 Bootrom Patch rev = 1023
[0000.173] Power-up reason: software reset
[0000.176] No Battery Present
[0000.179] pmic max77620 reset reason
[0000.182] pmic max77620 NVERC : 0x0
[0000.186] RamCode = 0
[0000.188] Platform has DDR4 type RAM
[0000.191] max77620 disabling SD1 Remote Sense
[0000.196] Setting DDR voltage to 1125mv
[0000.199] Serial Number of Pmic Max77663: 0x130228
[0000.207] Entering ramdump check
[0000.210] Get RamDumpCarveOut = 0x0
[0000.213] RamDumpCarveOut=0x0,  RamDumperFlag=0xe59ff3f8
[0000.219] Last reboot was clean, booting normally!
[0000.223] Sdram initialization is successful 
[0000.227] SecureOs Carveout Base=0x00000000ff800000 Size=0x00800000
[0000.233] Lp0 Carveout Base=0x00000000ff780000 Size=0x00001000
[0000.239] BpmpFw Carveout Base=0x00000000ff700000 Size=0x00080000
[0000.245] GSC1 Carveout Base=0x00000000ff600000 Size=0x00100000
[0000.251] GSC2 Carveout Base=0x00000000ff500000 Size=0x00100000
[0000.257] GSC4 Carveout Base=0x00000000ff400000 Size=0x00100000
[0000.263] GSC5 Carveout Base=0x00000000ff300000 Size=0x00100000
[0000.268] GSC3 Carveout Base=0x000000017f300000 Size=0x00d00000
[0000.285] RamDump Carveout Base=0x00000000ff280000 Size=0x00080000
[0000.291] Platform-DebugCarveout: 0
[0000.294] Nck Carveout Base=0x00000000ff080000 Size=0x00200000
[0000.300] Non secure mode, and RB not enabled.
[0000.304] BoardID = 3448, SKU = 0x0
[0000.307] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.311] Nano-SD: checking PT table on QSPI ...
[0000.315] Read PT from (2:0)
[0000.331] Using BFS PT to query partitions 
[0000.336] Loading Tboot-CPU binary
[0000.365] Verifying TBC in OdmNonSecureSBK mode
[0000.375] Bootloader load address is 0xa0000000, entry address is 0xa0000258
[0000.382] Bootloader downloaded successfully.
[0000.386] Downloaded Tboot-CPU binary to 0xa0000258
[0000.391] MAX77620_GPIO5 configured
[0000.394] CPU power rail is up
[0000.397] CPU clock enabled
[0000.401] Performing RAM repair
[0000.404] Updating A64 Warmreset Address to 0xa00002e9
[0000.409] BoardID = 3448, SKU = 0x0
[0000.412] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.416] Nano-SD: checking PT table on QSPI ...
[0000.420] Loading NvTbootBootloaderDTB
[0000.487] Verifying NvTbootBootloaderDTB in OdmNonSecureSBK mode
[0000.560] Bootloader DTB Load Address: 0x83000000
[0000.564] BoardID = 3448, SKU = 0x0
[0000.568] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.572] Nano-SD: checking PT table on QSPI ...
[0000.576] Loading NvTbootKernelDTB
[0000.642] Verifying NvTbootKernelDTB in OdmNonSecureSBK mode
[0000.715] Kernel DTB Load Address: 0x83100000
[0000.719] BoardID = 3448, SKU = 0x0
[0000.723] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.726] Nano-SD: checking PT table on QSPI ...
[0000.733] Loading cboot binary
[0000.848] Verifying EBT in OdmNonSecureSBK mode
[0000.890] Bootloader load address is 0x92c00000, entry address is 0x92c00258
[0000.897] Bootloader downloaded successfully.
[0000.901] BoardID = 3448, SKU = 0x0
[0000.904] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.908] Nano-SD: checking PT table on QSPI ...
[0000.913] PT: Partition NCT NOT found ! 
[0000.917] Warning: Find Partition via PT Failed
[0000.921] Next binary entry address: 0x92c00258 
[0000.925] BoardId: 3448
[0000.930] Overriding pmu board id with proc board id
[0000.935] Display board id is not available 
[0000.939] BoardID = 3448, SKU = 0x0
[0000.942] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0000.946] Nano-SD: checking PT table on QSPI ...
[0001.051] Verifying SC7EntryFw in OdmNonSecureSBK mode
[0001.108] /bpmp deleted
[0001.110] SC7EntryFw header found loaded at 0xff700000
[0001.305] OVR2 PMIC
[0001.307] Bpmp FW successfully loaded
[0001.311] BoardID = 3448, SKU = 0x0
[0001.314] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0001.318] Nano-SD: checking PT table on QSPI ...
[0001.323] WB0 init successfully at 0xff780000
[0001.327] Set NvDecSticky Bits
[0001.331] GSC2 address ff53fffc value c0edbbcc
[0001.337] GSC MC Settings done
[0001.340] BoardID = 3448, SKU = 0x0
[0001.343] QSPI-ONLY: SkipQspiOnlyFlag = 0
[0001.347] Nano-SD: checking PT table on QSPI ...
[0001.353] TOS Image length 53680
[0001.356]  Monitor size 53680
[0001.359]  OS size 0
[0001.374] Secure Os AES-CMAC Verification Success!
[0001.378] TOS image cipher info: plaintext
[0001.382] Loading and Validation of Secure OS Successful
[0001.398] SC7 Entry Firmware - 0xff700000, 0x4000
[0001.403] NvTbootPackSdramParams: start. 
[0001.408] NvTbootPackSdramParams: done. 
[0001.412] Tegraboot started after 86968 us
[0001.416] Basic modules init took 886996 us
[0001.420] Sec Bootdevice Read Time = 12 ms, Read Size = 61 KB
[0001.425] Sec Bootdevice Write Time = 0 ms, Write Size = 0 KB
[0001.431] Next stage binary read took 102860 us
[0001.435] Carveout took -126355 us
[0001.438] CPU initialization took 495400 us
[0001.442] Total time taken by TegraBoot 1358901 us

[0001.447] Starting CPU & Halting co-processor 

64NOTICE:  BL31: v1.3(release):5b49e7f80
NOTICE:  BL31: Built : 11:38:25, Jan 25 2021
ERROR:   Error initializing runtime service trusty_fast
[0001.569] RamCode = 0
[0001.574] LPDDR4 Training: Read DT: Number of tables = 2
[0001.579] EMC Training (SRC-freq: 204000; DST-freq: 1600000)
[0001.592] EMC Training Successful
[0001.595] 408000 not found in DVFS table
[0001.601] RamCode = 0
[0001.605] DT Write: emc-table@204000 succeeded
[0001.610] DT Write: emc-table@1600000 succeeded
[0001.614] LPDDR4 Training: Write DT: Number of tables = 2
[0001.662] 
[0001.663] Debug Init done
[0001.666] Marked DTB cacheable
[0001.668] Bootloader DTB loaded at 0x83000000
[0001.673] Marked DTB cacheable
[0001.676] Kernel DTB loaded at 0x83100000
[0001.680] DeviceTree Init done
[0001.693] Pinmux applied successfully
[0001.697] gicd_base: 0x50041000
[0001.701] gicc_base: 0x50042000
[0001.704] Interrupts Init done
[0001.708] Using base:0x60005090 & irq:208 for tick-timer
[0001.713] Using base:0x60005098 for delay-timer
[0001.718] platform_init_timer: DONE
[0001.721] Timer(tick) Init done
[0001.725] osc freq = 38400 khz
[0001.729] 
[0001.730] Welcome to L4T Cboot

Hi, and thanks, Tsum.

The logs still do not contain any information from the kernel.
Could you share the code snippet? That would help us debug.

Thanks & Regards,
Sandipan

Thanks for replying.

Is this what you mean?

It did not crash at [ 1219.445624].
It crashes when #code2 is executed in an interrupt context after the OS has booted and stabilised. The Jetson Nano then outputs what is in the attached file.

reboot_all.log (21.9 KB)

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.9.201-rt134 (root@aaa-desktop) (gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) ) #3 SMP PREEMPT RT Fri Jul 15 15:45:08 JST 2022
[    0.000000] Boot CPU: AArch64 Processor [411fd071]
[    0.000000] OF: fdt:memory scan node memory@80000000, reg size 32,
[    0.000000] OF: fdt: - 80000000 ,  7ee00000
[    0.000000] OF: fdt: - 100000000 ,  7f200000
[    0.000000] Found tegra_fbmem: 00800000@92ca9000
[    0.000000] earlycon: uart8250 at MMIO32 0x0000000070006000 (options '')
[    1.052342] tegradc tegradc.1: dpd enable lookup fail:-19
[    1.208581] imx219 7-0010: imx219_board_setup: error during i2c read probe (-121)
[    1.208656] imx219 7-0010: board setup failed
[    1.232572] imx219 8-0010: imx219_board_setup: error during i2c read probe (-121)
[    1.232638] imx219 8-0010: board setup failed
[    2.171374] sdhci: =========== REGISTER DUMP (mmc0)===========
[    2.171376] sdhci: Sys addr: 0x00000000 | Version:  0x00000303
[    2.171378] sdhci: Blk size: 0x00007040 | Blk cnt:  0x00000000
[    2.171379] sdhci: Argument: 0x00000000 | Trn mode: 0x00000010
[    2.171380] sdhci: Present:  0x01fb0206 | Host ctl: 0x00000016
[    2.171382] sdhci: Power:    0x00000001 | Blk gap:  0x00000000
[    2.171383] sdhci: Wake-up:  0x00000000 | Clock:    0x00000007
[    2.171385] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[    2.171386] sdhci: Int enab: 0x00000020 | Sig enab: 0x00000020
[    2.171387] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[    2.171389] sdhci: Caps:     0x376cd08c | Caps_1:   0x10006f73
[    2.171390] sdhci: Cmd:      0x0000133a | Max curr: 0x00000000
[    2.171391] sdhci: Host ctl2: 0x0000304b
[    2.171393] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000ffefe410
[    2.171394] sdhci: ===========================================
[    2.171395] mmc0: tuning execution failed: -5
[    2.171396] mmc0: error -5 whilst initialising SD card
[    4.781111] cgroup: cgroup2: unknown option "nsdelegate"
[   12.045950] using random self ethernet address
[   12.046377] using random host ethernet address
[   12.956746] using random self ethernet address
[   12.957394] using random host ethernet address

Ubuntu 18.04.5 LTS aaa-desktop ttyS0

aaa-desktop login: 

Is mmc0 unrelated? I see:

[    2.171395] mmc0: tuning execution failed: -5
[    2.171396] mmc0: error -5 whilst initialising SD card

Without seeing the actual code, and assuming mmc0 access itself isn’t being modified, I have to think that modifying a section of code in an unrelated interrupt handler may have violated some sort of access or timing requirement. The BH would normally be created (and scheduled for later execution) by an uninterruptible IRQ section, but the BH itself would not be timing critical. If that part of the code has some sort of synchronization issue, and either isn’t really qualified to be a bottom half, or is a BH but can run into something like a priority inversion, it could harm either this interrupt handler or a seemingly unrelated IRQ handler. If mmc0 is unrelated to this (meaning not even related to your modification), then something is still seriously wrong (mmc0 is fairly critical, and training during boot should eliminate that error).

You probably will need to post the full code for anyone to know what is going on, but if it is a timing issue, then it might still be very difficult to find even if the code involved really is BH.
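
As a point of comparison, scheduler-policy changes are normally made from process context rather than from inside the interrupt path. If a different fixed priority for an IRQ thread is all that is needed, one option is to set it from user space with sched_setscheduler(2). The following is only a sketch under that assumption: set_fifo_priority() is a hypothetical helper, the PID is assumed to be that of an irq/<nr>-<name> kernel thread found with ps, and the call needs root (CAP_SYS_NICE). It does not cover changing the priority dynamically on every interrupt.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* SCHED_FIFO priority range exposed to user space on Linux. */
#define FIFO_PRIO_MIN 1
#define FIFO_PRIO_MAX 99

/*
 * Hypothetical helper: give the thread with id `pid` the SCHED_FIFO
 * priority `prio`. A pid of 0 means the calling thread.
 * Returns 0 on success or a negative errno value on failure.
 */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param param = { .sched_priority = prio };

    /* A negative pid is a programming error in this sketch. */
    assert(pid >= 0);

    /* Reject out-of-range priorities before asking the kernel. */
    if (prio < FIFO_PRIO_MIN || prio > FIFO_PRIO_MAX)
        return -EINVAL;

    /* The kernel performs the change in the caller's process context. */
    if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
        return -errno;

    return 0;
}
```

From a root shell, the equivalent is roughly `chrt -f -p <prio> <pid>`.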
