Device reboots after using the RT-Kernel patch [JP4.6.3]

Hi all,

To solve some other issues, I need to update the Jetson software to Jetpack 4.6.3/L4T 32.7.3.
I applied the PREEMPT-RT patches (./scripts/rt-patch.sh apply-patches) and compiled the kernel, then replaced the boot image and installed the kernel modules… The process follows the document “To build the Jetson Linux Kernel” and completes without errors.
However, when the device boots, it reboots shortly after the NVIDIA logo appears, and the process repeats.

  1. To rule out damage to the TX2 NX module, I used flash.sh to flash the device. The flash succeeded and I was able to access the OS as normal, so the TX2 NX works fine.
  2. I’m using a TX2 NX module + Xavier NX DevKit carrier board + L4T 32.7.3. The boot log is as follows:
    boot-rt.log (64.6 KB)
Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x100
[    0.000000] Linux version 4.9.299-rt191-tegra (zzw@ubuntu) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05) ) #1 SMP PREEMPT RT Thu Feb 9 14:54:40 PST 2023
[    0.000000] Boot CPU: AArch64 Processor [411fd073]
[    0.000000] OF: fdt:memory scan node memory@80000000, reg size 80,
[    0.000000] OF: fdt: - 80000000 ,  70000000
[    0.000000] OF: fdt: - f0200000 ,  85600000
[    0.000000] OF: fdt: - 175e00000 ,  200000
[    0.000000] OF: fdt: - 176600000 ,  200000
[    0.000000] OF: fdt: - 177000000 ,  200000
[    0.000000] earlycon: uart8250 at MMIO32 0x0000000003100000 (options '')
[    1.263936] imx219 9-0010: imx219_board_setup: error during i2c read probe (-121)
[    1.271485] imx219 9-0010: board setup failed
[    1.299468] imx219 10-0010: imx219_board_setup: error during i2c read probe (-121)
[    1.307085] imx219 10-0010: board setup failed
[    1.791588] cgroup: cgroup2: unknown option "nsdelegate"
[    4.681180] random: crng init done
[    4.684589] random: 7 urandom warning(s) missed due to ratelimiting
[    5.751059] using random self ethernet address
[    5.755644] using random host ethernet address
[    6.269149] using random self ethernet address
[    6.278650] using random host ethernet address
[    9.371960] CPU1: shutdown
[    9.603312] CPU2: shutdown
[   10.294825] device-mapper: table: 253:0: thin-pool: unknown target type
[   10.302603] device-mapper: ioctl: error adding target to table
[   12.905866] device-mapper: table: 253:0: thin-pool: unknown target type
[   12.912480] device-mapper: ioctl: error adding target to table
[   15.249990] device-mapper: table: 253:0: thin-pool: unknown target type
[   15.258469] device-mapper: ioctl: error adding target to table
Ubuntu 18.04.6 LTS tx2nx-desktop ttyS0

tx2nx-desktop login: [0000.138] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f97eef92)

The boot log seems to show two or three cycles of the device rebooting. I can’t find anything in the log that points to the cause. Please share any insights. Thank you to everyone who offers advice!
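
To check how many boot cycles a captured UART log actually contains, the kernel start lines can be counted. Below is a minimal Python sketch, assuming every kernel start prints the "Booting Linux on physical CPU" line seen in the log above and that kernel timestamps use the `[seconds.microseconds]` form. The function name `summarize_boots` is hypothetical.

```python
import re

# Assumption: each kernel start prints this line, as in the log above.
BOOT_MARKER = re.compile(r"Booting Linux on physical CPU")
# Kernel timestamps have six decimals; bootloader lines such as
# "[0000.138] I> Welcome to MB2" have three and are ignored.
KERNEL_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d{6})\]")


def summarize_boots(log_text):
    """Return the last kernel timestamp (in seconds) seen in each boot cycle."""
    cycles = []
    for line in log_text.splitlines():
        if BOOT_MARKER.search(line):
            cycles.append(0.0)  # a new boot cycle starts here
            continue
        match = KERNEL_TIMESTAMP.match(line)
        if match and cycles:
            cycles[-1] = max(cycles[-1], float(match.group(1)))
    return cycles


if __name__ == "__main__":
    sample = "\n".join([
        "[    0.000000] Booting Linux on physical CPU 0x100",
        "[   15.258469] device-mapper: ioctl: error adding target to table",
        "[0000.138] I> Welcome to MB2(TBoot-BPMP)",
        "[    0.000000] Booting Linux on physical CPU 0x100",
        "[   12.912480] device-mapper: ioctl: error adding target to table",
    ])
    cycles = summarize_boots(sample)
    print(f"{len(cycles)} boot cycle(s), last timestamps: {cycles}")
```

The length of the returned list is the number of boots in the capture, and each value shows roughly how far the kernel got before the log restarted.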

B&R
zzw

About “tx2nx-desktop”:
When I first booted the device, it worked fine initially. I configured the username and password. However, before I could complete the next few configuration steps, the device automatically restarted.

Hi,
Please try the prebuilt package:
https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3261/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/kernel_custom.html#wwpID0EXHA

We would like to confirm whether the prebuilt package works. Please give it a try.

Hi DaneLLL,

Thank you for your reply. I tested the prebuilt packages from the following repository entry, as you suggested.

deb https://repo.download.nvidia.com/jetson/rt-kernel r32.7 main

I can see “4.9.253-rt168-tegra” after a reboot. Real-time performance can be observed using cyclictest. I’ll test it further.

It seems that using the prebuilt packages is a good approach, but I have a further question.

How do I replace the kernel image (/boot/Image) with my own modified one?

As we know, I can build my own kernel and replace Kernel/Image + Kernel/dtb in Linux_for_Tegra. In this process, we apply the patches to add real-time support.

If I don’t need real-time, I can generate my own Image from the source and replace /boot/Image.

If I get real-time support through the prebuilt packages, how do I generate my own Image that works with that environment?

Hi,
For our information, do you observe the issue on the TX2 developer kit, or on a TX2 module with your custom carrier board?

Hi DaneLLL,

I’m using a TX2 NX module + Xavier NX DevKit carrier board + L4T 32.7.3.

Because the TX2 NX has no official developer kit, I use it with the Xavier NX DK board. That should be the officially recommended usage.

By the way, on the stock L4T 32.7.3, uname -a reports 4.9.299-tegra; after installing the prebuilt packages, it reports 4.9.253-rt168-tegra. So I’m not 100% sure a prebuilt RT kernel for this release is available yet.

My own compiled kernel, which causes the device to reboot:

[    0.000000] Linux version 4.9.299-rt191-tegra (zzw@ubuntu) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05) ) #1 SMP PREEMPT RT Thu Feb 9 14:54:40 PST 2023

It looks like the expected version should be 4.9.299-rt191-tegra.
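
The version mismatch described above can be checked mechanically by splitting each release string into its base version and RT patch level. This is only a sketch in Python: the helper names `parse_release` and `same_base` are hypothetical, and the pattern assumes release strings of the form reported by uname in this thread (for example 4.9.299-tegra or 4.9.299-rt191-tegra).

```python
import re

# Assumption: releases look like "4.9.299-tegra" or "4.9.299-rt191-tegra".
RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-rt(\d+))?-tegra$")


def parse_release(release):
    """Split a release string into (base_version, rt_patch_level or None)."""
    match = RELEASE.match(release)
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, patch, rt = match.groups()
    base = (int(major), int(minor), int(patch))
    return base, (int(rt) if rt is not None else None)


def same_base(stock, candidate):
    """True when both kernels share the same base version."""
    return parse_release(stock)[0] == parse_release(candidate)[0]


if __name__ == "__main__":
    stock = "4.9.299-tegra"
    for candidate in ("4.9.253-rt168-tegra", "4.9.299-rt191-tegra"):
        verdict = "same base" if same_base(stock, candidate) else "different base"
        print(f"{candidate}: {verdict} as {stock}")
```

A different base version does not by itself prove a kernel is unusable, but it is a quick signal that the kernel was built for another release.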

B&R

Hi DaneLLL,

I’m sorry I didn’t fully understand you just now.

I have Nano, Xavier NX, TX2 NX modules or developer kits. Sometimes I can also use the AGX Orin devkit.

You may have seen that I have a thread on the forum about TX2. Unfortunately it’s not my device, so I can’t provide information on testing on the TX2.

Btw, my friend tested TX2 NX on a custom board and also had reboot problems. We all follow the normal compile process.

B&R

Hi,
Please try the steps in the “Skipping oem-config” section to create a default user. This skips user creation on first boot. See whether the device can then boot to the Ubuntu desktop.

Hi DaneLLL,

I will try it. Thank you for your concern about this thread.

B&R

Hi DaneLLL,

I created a default user named “zzww” with l4t_create_default_user.sh, and set automatic login.

According to the debug UART output, I was able to log in for an extremely brief period before the device rebooted. On the monitor, I never see the Ubuntu desktop.

Between the first login and the reboot, there are a few lines like this:

zzww@tx2nx:~$ [   41.226794] INFO: rcu_preempt detected stalls on CPUs/tasks:
[   41.226800] 5-...: (0 ticks this GP) idle=eb7/140000000000000/0 softirq=0/0 fqs=2 
[   41.226804] (detected by 4, t=21002 jiffies, g=4670, c=4669, q=437)
[   41.226833] rcu_preempt kthread starved for 20994 jiffies! g4670 c4669 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0

It has not recurred. I’m not sure whether it’s relevant to the issue.
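
To compare stall reports across several captures, the key fields can be pulled out of these lines. The following Python sketch assumes the line layout shown above (a per-CPU line such as `5-...:`, a `detected by` line, and a `kthread starved` line). The function name `parse_rcu_stall` and the dictionary keys are hypothetical, and the values are left in jiffies because the kernel's HZ setting is not stated in the log.

```python
import re

# Assumption: patterns are modeled on the four log lines quoted above.
STALLED_CPU = re.compile(r"\]\s+(\d+)-[^\s:]{3}:")
DETECTED_BY = re.compile(r"detected by (\d+), t=(\d+) jiffies")
STARVED = re.compile(r"kthread starved for (\d+) jiffies")


def parse_rcu_stall(lines):
    """Collect the stalled CPUs, the reporting CPU, and the jiffy counts."""
    report = {
        "stalled_cpus": [],
        "detected_by": None,
        "stall_jiffies": None,
        "kthread_starved_jiffies": None,
    }
    for line in lines:
        cpu = STALLED_CPU.search(line)
        if cpu:
            report["stalled_cpus"].append(int(cpu.group(1)))
        detected = DETECTED_BY.search(line)
        if detected:
            report["detected_by"] = int(detected.group(1))
            report["stall_jiffies"] = int(detected.group(2))
        starved = STARVED.search(line)
        if starved:
            report["kthread_starved_jiffies"] = int(starved.group(1))
    return report


if __name__ == "__main__":
    excerpt = [
        "[   41.226800] 5-...: (0 ticks this GP) idle=eb7/140000000000000/0 softirq=0/0 fqs=2",
        "[   41.226804] (detected by 4, t=21002 jiffies, g=4670, c=4669, q=437)",
    ]
    print(parse_rcu_stall(excerpt))
```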

boot-default.log (77.0 KB)

Hi,
This looks to be an issue specific to using Jetpack 4.6.3. Could you try Jetpack 4.6?

Hi DaneLLL,

JetPack 4.6/L4T 32.6.1 is the version we used before. There are occasional issues with Ethernet chips in that release, and the update to JetPack 4.6.3/L4T 32.7.3 resolves them.

Therefore, the new version of the software is very important to us. We would prefer to have this fixed if possible. Thanks a lot.

B&R

Hi all,

Waiting for a potential solution…

Anyone who sees this thread and has similar problems is welcome to join the discussion.

In our work, L4T 32.7.3 has proved itself more applicable than L4T 32.6.1. We need to build more on it, with rt kernel, before the next release update.

If I have made a mistake and you know the right way, please let me know. Thank you!

Hi,
We are checking the issue and will update once there are further findings.

Hi,
I wonder if there is some information worth sharing on this issue?

@Jackdaws_3 we have a similar issue with the kernel locking up and the device rebooting. See “JetPack 4.6.3: dmesg hangs the kernel and the device reboots”.

The prebuilt PREEMPT-RT kernel package you used (4.9.253) is from the previous JetPack 4.6; 4.9.299-tegra is for JetPack 4.6.3.

I have tried to build an image on top of JetPack 4.6.3 to get support for the new Hynix memory on the TX2, and downgraded the kernel to 4.9.253. I’m able to boot and all the peripherals are detected properly, but when trying to access CUDA I get these errors:

[ 22.372048] gk20a 17000000.gp10b: Direct firmware load for gp10b/acr_ucode.bin failed with error -2
[ 22.381142] nvgpu: 17000000.gp10b gm20b_bootstrap_hs_acr:1279 [ERR] acr_ucode.bin ucode get fail for gp10b
[ 22.391893] nvgpu: 17000000.gp10b gk20a_finalize_poweron:328 [ERR] ACR bootstrap failed
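
For reference, the trailing number in the first error line is a negative Linux errno: -2 corresponds to ENOENT, which usually means the firmware file was not found where the kernel looked for it. A minimal Python sketch for decoding such codes from log lines follows. The function name `decode_errno` is hypothetical, and the table is a deliberately small, hard-coded subset of Linux errno values (an assumption, chosen so the lookup does not depend on the host platform).

```python
import re

# Assumption: a small hard-coded subset of Linux errno values.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    121: ("EREMOTEIO", "Remote I/O error"),
}

# Matches a trailing negative code such as "error -2" or "(-121)".
TRAILING_CODE = re.compile(r"(?:\s|\()-(\d+)\)?\s*$")


def decode_errno(line):
    """Return (code, name, description) for a trailing errno, else None."""
    match = TRAILING_CODE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, description = LINUX_ERRNO.get(code, ("unknown", "not in the table"))
    return code, name, description


if __name__ == "__main__":
    line = (
        "[ 22.372048] gk20a 17000000.gp10b: Direct firmware load for "
        "gp10b/acr_ucode.bin failed with error -2"
    )
    print(decode_errno(line))
```

The same helper also decodes the `(-121)` reported by the imx219 probe lines in the first boot log.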

Were you able to use CUDA with 4.9.253-rt168-tegra?

@DaneLLL should this setup be working?

Hi,

@Jackdaws_3
please refer to this post for a temporary solution.
