AGX Orin cannot boot from NVMe with the real-time kernel (JetPack 6.2)

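We use the AGX to control our robot and are currently on JetPack 6.2. Because we need fairly strict real-time behavior, we tried switching to the real-time kernel to shorten its response time. We tried the following, in order:
1. Installing with apt, following the official documentation:
Documentation: Kernel Customization — NVIDIA Jetson Linux Developer Guide
Method: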


This method works without problems when installing to eMMC, but when installing to NVMe the device hangs at the following screen:

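2. We then tried building the kernel ourselves and installing it.
The reference documentation is the same as above.
This mainly involves three parts: building and replacing the kernel, the out-of-tree modules, and the DTBs.
The procedure mainly followed several sections of the official site.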

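Result: when the built system is flashed to eMMC, it boots normally; when it is flashed to NVMe, the screen stays black and the system does not boot.
We really cannot tell where the problem is. There are two things we urgently want to understand:
1. Why the device cannot boot from NVMe after switching to the real-time kernel.
2. A practical method and set of steps for booting from NVMe with the real-time kernel (for example, a method you have tried and confirmed to work).
In any case, thank you very much!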

Hi,
Please follow the steps to rebuild the RT kernel, and flash the AGX Orin developer kit with the initrd flash command:

Quick Start — NVIDIA Jetson Linux Developer Guide

Jetson AGX Orin Developer Kit (NVMe):

$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 \
  -c tools/kernel_flash/flash_l4t_t234_nvme.xml \
  --showlogs --network usb0 jetson-agx-orin-devkit external

The link can’t be opened.

Hi,
The link was wrong and it is corrected. Please check again.

Hi DaneLLL,
I have tested building a real-time (RT) kernel for the Jetson AGX Orin Developer Kit with JetPack 6.2 (Jetson Linux 36.4.3). Below is a detailed description of my process and results.

1. Initial Setup and Full Flash Package:
I followed the Quick Start — NVIDIA Jetson Linux Developer Guide to create a complete flash package, which successfully flashed Jetson Linux onto the AGX Orin.

2. Source Code and Toolchain Preparation:
I downloaded the Driver Package (BSP) Sources and Bootlin Toolchain gcc 11.3 from Jetson Linux | NVIDIA Developer and placed them in my host directory: ~/nvidia/jetpack/BSP36.4.3JP6.2/.

  • Directory contents:

$ cd ./nvidia/jetpack/BSP36.4.3JP6.2/
$ ls
aarch64--glibc--stable-2022.08-1.tar.bz2 Linux_for_Tegra Tegra_Linux_Sample-Root-Filesystem_R36.4.3_aarch64.tbz2
Jetson_Linux_R36.4.3_aarch64.tbz2 public_sources.tbz2

3. Building the Linux Kernel:
I referenced the Kernel Customization — NVIDIA Jetson Linux Developer Guide to manually build the Linux kernel.

  • Below are the commands I used during the kernel build process:

$ sudo apt install git-core
$ sudo apt install build-essential bc
$ mkdir $HOME/l4t-gcc
$ cd ~/l4t-gcc/
$ tar xf ../nvidia/jetpack/BSP36.4.3JP6.2/aarch64--glibc--stable-2022.08-1.tar.bz2
$ export CROSS_COMPILE=$HOME/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ cd ~/nvidia/jetpack/BSP36.4.3JP6.2/
$ tar xf ./public_sources.tbz2 -C ./Linux_for_Tegra/..
$ cd ./Linux_for_Tegra/
$ cd source/
$ tar xf kernel_src.tbz2
$ tar xf kernel_oot_modules_src.tbz2
$ tar xf nvidia_kernel_display_driver_source.tbz2
$ ./generic_rt_build.sh "enable"
$ export CROSS_COMPILE=$HOME/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ make -C kernel
$ export INSTALL_MOD_PATH=~/nvidia/jetpack/BSP36.4.3JP6.2/Linux_for_Tegra/rootfs/
$ sudo -E make install -C kernel
$ cp kernel/kernel-jammy-src/arch/arm64/boot/Image ../../Linux_for_Tegra/kernel/Image
$ sudo ./apply_binaries.sh
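
Before flashing, it can help to confirm that the Image copied into Linux_for_Tegra/kernel/ really is the RT build. The following is a minimal sketch, not part of the official procedure: it assumes the uncompressed arm64 Image embeds the usual "Linux version <release>" banner, and `kernel_release_of_image` is a hypothetical helper name.

```shell
# Print the release string embedded in an uncompressed arm64 kernel Image.
# Assumption: the Image contains the standard "Linux version <release> ..." banner.
kernel_release_of_image() {
    image="$1"
    # -a reads the binary as text; -o prints only the matched banner prefix.
    LC_ALL=C grep -a -o 'Linux version [^ ]*' "$image" | head -n 1 | cut -d' ' -f3
}

# Example, using the path from the steps above:
#   kernel_release_of_image ~/nvidia/jetpack/BSP36.4.3JP6.2/Linux_for_Tegra/kernel/Image
```

If the printed release lacks the `-rt` part seen later in `uname -r`, the copy step probably picked up a different Image.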

4. Flashing with eMMC:

  • I flashed the AGX Orin to eMMC using the following command, which reported “flashed successfully”:

$ sudo ./flash.sh jetson-agx-orin-devkit internal

However, the device failed to boot. When I removed the NVMe drive and re-flashed, it booted successfully. After booting, I confirmed that the kernel was indeed an RT kernel (as shown in the attached image).


5. Flashing with NVMe:

  • I then attempted to flash the Orin with an NVMe drive using the following command:

$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 \
  -c tools/kernel_flash/flash_l4t_t234_nvme.xml --showlogs --network usb0 \
  jetson-agx-orin-devkit external

  • The flash failed with the following error (only the last few lines of the flash log are included):

Formatting APP parition done
Formatting APP partition /dev/mmcblk0p1 …
tar --checkpoint-action="ttyout=Hit %s checkpoint #%u%\r" -x -I ‘zstd -T0’ -pf /mnt/internal/system.img --warning=no-timestamp --numeric-owner --xattrs --xattrs-include= -C /tmp/ci-9rIns394DH
writing item=78, 1:3:secondary_gpt, 61203267072, 16896, gpt_secondary_1_3.bin, 16896, fixed--0, 3852b3904c6c14c3e5baf995c8979246b95a80ef
[ 87]: l4t_flash_from_kernel: Successfully flash the emmc
writing item=17, 9:0:secondary_gpt, 61203267072, 16896, gpt_secondary_9_0.bin, 16896, fixed--0, 9f35eb6f3dd36c82c60803bb798f3f01f41a9c2a
[ 179]: l4t_flash_from_kernel: Successfully flash the external device
[ 179]: l4t_flash_from_kernel: The device size indicated in the partition layout xml is smaller than the actual size. This utility will try to fix the GPT.
[ 179]: l4t_flash_from_kernel: The device size indicated in the partition layout xml is smaller than the actual size. This utility will try to fix the GPT.
[ 179]: l4t_flash_from_kernel: Error flashing qspi
Flash failure
Either the device cannot mount the NFS server on the host or a flash command has failed. Check your network setting (VPN, firewall,…) to make sure the device can mount NFS server. Debug log saved to /tmp/tmp.IYd9L8GXl7. You can access the target’s terminal through “sshpass -p root ssh root@fc00:1:1:0::2”
Cleaning up…

I strictly followed the steps outlined in this section of the official documentation for my operations. I’m unsure if the documentation omitted certain steps that I might have overlooked. Could you please help me review my process and identify any potential issues? If you need specific logs or additional information, please let me know, and I will obtain and upload them. Thank you for your assistance!

This part is missing.

Hi WayneWWW,
Thank you for your reply. Are you saying that I need to follow the steps in the Building the NVIDIA Out-of-Tree Modules section and the subsequent sections as well, even though I am using the Developer Kit? I’m also a bit confused about why the RT kernel already worked when booting from eMMC.

Hi,

The logic here is:

  1. If you rebuild the kernel image, you need to update the out-of-tree modules too. For example, you would not expect your kernel modules to keep working if you changed the kernel image from k5.10 to k5.15 (just an example; I don’t mean you are doing that upgrade).

  2. In rel-36, the PCIe driver and the NVMe driver are out-of-tree modules. They fail to work because you changed to the RT kernel image without rebuilding them.

  3. eMMC boot does not depend on PCIe or NVMe.
    You can boot from eMMC, but I think PCIe would still have issues there.

  4. It does not matter which carrier board you are using. Rebuilding the modules is simply what you do whenever you change the kernel.
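
One way to act on the points above is to check, on the host, that the rootfs contains module files for the kernel release you built. This is a rough sketch under assumptions: `check_modules_present` is a hypothetical helper, the release string is whatever `uname -r` reports for your build, and the module names are the ones you care about.

```shell
# Report whether each named module exists under <rootfs>/lib/modules/<release>.
# Module file names may use '-' or '_' and may be compressed, so match loosely.
check_modules_present() {
    rootfs="$1"; release="$2"; shift 2
    moddir="$rootfs/lib/modules/$release"
    status=0
    for name in "$@"; do
        # Turn every '-' or '_' into the glob class [-_], e.g. pcie[-_]tegra194.
        pattern=$(printf '%s' "$name" | sed 's/[-_]/[-_]/g')
        if find "$moddir" -type f -name "${pattern}.ko*" 2>/dev/null | grep -q .; then
            echo "found   $name"
        else
            echo "missing $name"
            status=1
        fi
    done
    return "$status"
}

# Example (rootfs path from this thread; release as reported by uname -r):
#   check_modules_present ~/nvidia/jetpack/BSP36.4.3JP6.2/Linux_for_Tegra/rootfs \
#       5.15.148-rt-tegra pcie_tegra194 nvme
```

A "missing" line only says the file is absent from that directory; it does not tell you whether the driver is built into the kernel instead.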

Hi,
The NVIDIA OOT modules have to be rebuilt along with the upstream kernel. Please rebuild them and check whether the issue is still present.

@WayneWWW @DaneLLL
Thank you both for your assistance. Based on your suggestions, I continued to refer to the Building the NVIDIA Out-of-Tree Modules and Building the DTBs sections and performed the following operations:

$ cd ./nvidia/jetpack/BSP36.4.3JP6.2/
$ cd Linux_for_Tegra/source/
$ export IGNORE_PREEMPT_RT_PRESENCE=1
$ export CROSS_COMPILE=$HOME/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ export KERNEL_HEADERS=$PWD/kernel/kernel-jammy-src
$ make modules
$ export INSTALL_MOD_PATH=~/nvidia/jetpack/BSP36.4.3JP6.2/Linux_for_Tegra/rootfs/
$ sudo -E make modules_install
$ cd ..
$ sudo ./tools/l4t_update_initrd.sh
$ cd source/
$ export CROSS_COMPILE=$HOME/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ export KERNEL_HEADERS=$PWD/kernel/kernel-jammy-src
$ make dtbs
$ cp kernel-devicetree/generic-dts/dtbs/* ../../Linux_for_Tegra/kernel/dtb/
$ cd ..
$ sudo ./apply_binaries.sh

After that, I re-flashed the AGX Orin with NVMe using the same command as before. This time, it reported "Flash is successful," but the device still failed to boot and got stuck at the NVIDIA logo screen.

  • Flash log:

  • Orin screen output

Below are the commands and outputs from my host terminal. Could you please help me review them to identify any errors? Do I need to run $ sudo ./apply_binaries.sh before flashing?

Please check the device side UART log now. Flash has been done.

You should not run this again before flashing. Doing so could undo your kernel update and put the default files back. You should only run it right after you download the package.

Since I’ve been continuously connected to the device-side UART log since this morning, there are now a lot of miscellaneous logs. Please wait a moment while I clean them up and retrieve a complete log again.

Perhaps $ sudo ./apply_binaries.sh does not affect my built kernel. I just tried removing the NVMe and flashing the eMMC again, and the result was a successful flash with the real-time kernel running after boot.

I re-flashed with NVMe, and the host side still reported “Flash is successful.” The UART log includes the logs from the flashing process as well as the system power-on logs after flashing is complete. The system attempts to reboot multiple times and ultimately gets stuck at a specific point in the log. The device’s DP display remains stuck at the NVIDIA logo screen.
UART.log (2.2 MB)

The issue is still the same. The PCIe driver is not up, so the NVMe drive cannot be used.

[ 24.716020] ERROR: mounting PARTUUID=0288cb51-cc9a-463e-9019-b7dc5e7a590e as /mnt fail…
[ 24.717768] ERROR: PARTUUID=0288cb51-cc9a-463e-9019-b7dc5e7a590e mount fail…
[ 24.719764] ttyTCU0: Press [ENTER] to start bash in 30 seconds…
[ 27.720479] ttyTCU0: Press [ENTER] to start bash in 27 seconds…
[ 30.721050] ttyTCU0: Press [ENTER] to start bash in 24 seconds…
[ 33.721611] ttyTCU0: Press [ENTER] to start bash in 21 seconds…
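
When a UART capture runs to several megabytes, a quick filter can pull out the lines relevant to this failure. The sketch below is only illustrative: `summarize_boot_log` is a hypothetical helper, and the search patterns are guesses based on the messages quoted above, not an exhaustive list.

```shell
# Print a count of mount failures plus any PCIe/NVMe related lines
# from a saved UART or dmesg log.
summarize_boot_log() {
    log="$1"
    echo "mount failures: $(grep -a -c -E 'mount.*fail' "$log")"
    echo "pcie/nvme lines:"
    grep -a -i -E 'pcie|nvme' "$log" || echo "  (none)"
}

# Example:
#   summarize_boot_log UART.log
```

If no PCIe or NVMe lines show up at all during the initrd stage, that would be consistent with the driver never being loaded.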

Just to clarify: when you boot from eMMC, what is the output of your lsmod command?

apply_binaries.sh may not affect /boot/Image, but it could replace all the kernel modules, so your

$ make modules
$ export INSTALL_MOD_PATH=~/nvidia/jetpack/BSP36.4.3JP6.2/Linux_for_Tegra/rootfs/
$ sudo -E make modules_install

are all in vain.
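
Whether a given step replaces the installed modules can be checked directly instead of guessed. The following sketch records a checksum manifest of the module files so two snapshots can be compared. `module_manifest` is a hypothetical helper, it relies on GNU `sort -z` and `xargs -0 -r`, and reading the rootfs may require sudo on your host.

```shell
# Print a sorted "checksum  relative-path" manifest of every kernel module
# (*.ko, optionally compressed) under a directory.
module_manifest() {
    dir="$1"
    ( cd "$dir" && find . -type f -name '*.ko*' -print0 | sort -z | xargs -0 -r sha256sum )
}

# Example: snapshot, run the step you want to audit, snapshot again, compare.
#   module_manifest "$INSTALL_MOD_PATH/lib/modules" > /tmp/before.txt
#   ... run the step ...
#   module_manifest "$INSTALL_MOD_PATH/lib/modules" > /tmp/after.txt
#   diff /tmp/before.txt /tmp/after.txt && echo "modules unchanged"
```

Any line reported by `diff` names a module file that was added, removed, or rewritten by the audited step.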

Before performing the steps in Building the NVIDIA Out-of-Tree Modules and Building the DTBs, my attempts to flash with NVMe consistently failed. However, now the flash is successful, so I believe my operations are taking effect. Nevertheless, I will redo the above steps again and update the final results. Before redoing them, I removed the NVMe and re-flashed the eMMC, and the current output of the lsmod command after booting is as follows.

nvidia@tegra-ubuntu:~$ lsmod
Module                  Size  Used by
nvidia_drm             94208  1
nvidia_modeset       1306624  5 nvidia_drm
lzo_rle                16384  64
lzo_compress           16384  1 lzo_rle
zram                   28672  16
zsmalloc               36864  1 zram
nvme_fabrics           24576  0
nvme_core             110592  1 nvme_fabrics
ramoops                28672  0
reed_solomon           20480  1 ramoops
joydev                 32768  0
algif_hash             24576  1
algif_skcipher         20480  1
bridge                270336  0
af_alg                 32768  6 algif_hash,algif_skcipher
bnep                   28672  2
stp                    20480  1 bridge
llc                    20480  2 bridge,stp
usb_f_ncm              28672  2
usb_f_mass_storage     53248  2
rtk_btusb              77824  0
btusb                  49152  0
btrtl                  24576  1 btusb
btintel                45056  1 btusb
btbcm                  28672  1 btusb
usb_f_acm              16384  2
u_serial               20480  3 usb_f_acm
usb_f_rndis            32768  2
u_ether                32768  2 usb_f_rndis,usb_f_ncm
libcomposite           65536  16 usb_f_rndis,usb_f_mass_storage,usb_f_acm,usb_f_ncm
tegra_pcie_dma_test    20480  0
tegra_pcie_edma        20480  1 tegra_pcie_dma_test
rtl8822ce            3379200  0
snd_soc_tegra186_asrc    40960  1
snd_soc_tegra210_admaif   131072  1
snd_soc_tegra210_ope    36864  1
snd_soc_tegra_pcm      16384  1 snd_soc_tegra210_admaif
snd_soc_tegra186_arad    28672  2 snd_soc_tegra186_asrc
snd_soc_tegra210_mixer    49152  1
snd_soc_tegra186_dspk    20480  2
snd_soc_tegra210_mvc    20480  2
snd_soc_tegra210_adx    32768  4
snd_soc_tegra210_afc    20480  6
snd_soc_tegra210_dmic    20480  4
snd_soc_tegra210_amx    36864  4
snd_soc_tegra210_sfc    61440  4
snd_soc_tegra210_i2s    28672  6
cfg80211              884736  1 rtl8822ce
snd_soc_tegra210_ahub  1273856  3 snd_soc_tegra210_ope,snd_soc_tegra210_sfc
tegra210_adma          24576  2
nvvrs_pseq_rtc         16384  0
spidev                 28672  0
ucsi_ccg               28672  0
snd_soc_tegra_machine_driver    16384  0
snd_hda_codec_hdmi     69632  1
typec_ucsi             36864  1 ucsi_ccg
snd_soc_tegra_utils    32768  2 snd_soc_tegra210_admaif,snd_soc_tegra_machine_driver
snd_soc_rt5640        126976  1
crct10dif_ce           20480  1
snd_soc_simple_card_utils    28672  1 snd_soc_tegra_utils
typec                  61440  1 typec_ucsi
tegra23x_perf_uncore    20480  0
tegra234_oc_event      16384  0
nvpmodel_clk_cap       16384  0
snd_soc_rl6231         16384  1 snd_soc_rt5640
nvethernet           1175552  0
tegra_cactmon_mc_all    16384  0
tegra_mce              28672  1 tegra23x_perf_uncore
mttcan                 69632  0
at24                   24576  0
thermal_trip_event     16384  0
nvpps                  32768  2 mttcan,nvethernet
snd_hda_tegra          16384  0
tegra_aconnect         16384  0
can_dev                40960  1 mttcan
tegra234_aon           57344  1
snd_hda_codec         139264  2 snd_hda_codec_hdmi,snd_hda_tegra
pwm_tegra_tachometer    16384  0
tegra_xudc             45056  0
snd_hda_core          102400  3 snd_hda_codec_hdmi,snd_hda_codec,snd_hda_tegra
phy_tegra194_p2u       16384  13
nvidia_vrs_pseq        16384  0
spi_tegra114           28672  0
i2c_nvvrs11            16384  0
nvidia               1589248  9 nvidia_modeset
lm90                   28672  0
mc_hwpm                16384  0
pwm_tegra              20480  1
pcie_tegra194          40960  0
host1x_fence           24576  0
tegra_dce             110592  2 nvidia
tegra_bpmp_thermal     16384  0
nvhost_isp5            16384  0
nvhost_nvcsi_t194      16384  0
nvhost_vi5             20480  0
tegra_camera          249856  3 nvhost_isp5,nvhost_nvcsi_t194,nvhost_vi5
v4l2_dv_timings        40960  1 tegra_camera
nvhost_nvcsi           24576  1 tegra_camera
tegra_camera_platform    24576  4 nvhost_isp5,nvhost_nvcsi_t194,tegra_camera,nvhost_vi5
capture_ivc            32768  1 tegra_camera
tegra_camera_rtcpu    229376  2 capture_ivc,tegra_camera
ivc_bus                24576  2 capture_ivc,tegra_camera_rtcpu
hsp_mailbox_client     20480  2 ivc_bus,tegra_camera_rtcpu
ivc_ext                20480  2 ivc_bus,capture_ivc
v4l2_fwnode            20480  1 tegra_camera
v4l2_async             24576  2 v4l2_fwnode,tegra_camera
videobuf2_dma_contig    24576  1 tegra_camera
videobuf2_memops       20480  1 videobuf2_dma_contig
governor_userspace     16384  0
videobuf2_v4l2         32768  1 tegra_camera
videobuf2_common       65536  4 videobuf2_dma_contig,videobuf2_v4l2,tegra_camera,videobuf2_memops
tegra_drm             307200  10
videodev              270336  4 v4l2_async,videobuf2_v4l2,tegra_camera,videobuf2_common
nvhost_pva            167936  0
nvhost_nvdla          110592  0
nvhost_capture         20480  2 nvhost_isp5,nvhost_vi5
mc                     61440  4 videodev,videobuf2_v4l2,tegra_camera,videobuf2_common
tegra_wmark            16384  0
nvhwpm                139264  4 mc_hwpm,tegra_drm,nvhost_nvdla,nvhost_pva
cec                    57344  1 tegra_drm
host1x_nvhost          40960  10 nvhost_isp5,nvhost_nvcsi_t194,nvidia,tegra_camera,nvhost_nvdla,nvhost_capture,nvhost_nvcsi,nvhost_pva,nvhost_vi5,nvidt
drm_kms_helper        303104  3 tegra_drm,nvidia_drm
tegra_se               61440  2
tsecriscv              36864  1 nvidia
crypto_engine          16384  1 tegra_se
nvidia_p2p             20480  0
ina3221                24576  0
pwm_fan                20480  0
nvgpu                2846720  19
governor_pod_scaling    45056  0
host1x                208896  9 host1x_nvhost,host1x_fence,tegra_se,nvgpu,tegra_drm,nvhost_nvdla,nvidia_drm,nvhost_pva,nvidia_modeset
mc_utils               16384  3 nvidia,nvgpu,tegra_camera_platform
nvmap                 237568  95 nvgpu
nvsciipc               24576  1 nvmap
drm                   634880  16 drm_kms_helper,nvidia,tegra_drm,nvidia_drm
fuse                  143360  1
ip_tables              32768  0
x_tables               49152  1 ip_tables
ipv6                  512000  205 bridge
nvidia@tegra-ubuntu:~$ uname -r
5.15.148-rt-tegra
nvidia@tegra-ubuntu:~$
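
A listing this long is easier to read with a filter for the drivers in question. As a small sketch, the hypothetical helper `missing_modules` reads `lsmod` output on stdin and prints whichever of the requested names are not loaded. The names in the example are taken from this thread and are not an authoritative list of what NVMe boot requires.

```shell
# Read `lsmod` output on stdin and print each requested module that is not loaded.
missing_modules() {
    # The first column of every line after the header is the module name.
    loaded=$(awk 'NR > 1 { print $1 }')
    for name in "$@"; do
        printf '%s\n' "$loaded" | grep -qx -- "$name" || echo "$name"
    done
}

# Example, on the device:
#   lsmod | missing_modules pcie_tegra194 phy_tegra194_p2u nvme_core nvme
```

A module reported here may still be built into the kernel, or may simply not load because its hardware is absent, so treat the output as a starting point only.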

Could you share the full dmesg from your eMMC boot?

The output information of $ sudo dmesg -T > ~/Desktop/dmesg.log is provided in the attachment for reference.
dmesg.log (76.3 KB)

Please compare the pcie-tegra194.ko in your eMMC rootfs and initrd with the one in your NVMe rootfs and initrd, and check whether they are identical.
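
The comparison can be done by checksum once both trees are available on the host. This is a sketch under assumptions: `module_sha256` and `compare_module` are hypothetical helpers, and the unpacking hint in the comments assumes the initrd is a gzip-compressed cpio archive, which you should verify for your release.

```shell
# Print the SHA-256 of the first file matching <name>.ko* under a directory tree.
module_sha256() {
    tree="$1"; name="$2"
    path=$(find "$tree" -type f -name "${name}.ko*" | sort | head -n 1)
    [ -n "$path" ] || { echo "not-found"; return 1; }
    sha256sum "$path" | cut -d' ' -f1
}

# Compare one module between two trees, e.g. a rootfs and an unpacked initrd.
compare_module() {
    name="$1"
    sum_a=$(module_sha256 "$2" "$name") || true
    sum_b=$(module_sha256 "$3" "$name") || true
    if [ "$sum_a" = "$sum_b" ] && [ "$sum_a" != "not-found" ]; then
        echo "same"
    else
        echo "different"
        return 1
    fi
}

# Example (assumes a gzip-compressed cpio initrd; paths are placeholders):
#   mkdir /tmp/initrd && cd /tmp/initrd
#   gzip -dc /path/to/initrd | sudo cpio -idm
#   compare_module pcie-tegra194 /path/to/rootfs /tmp/initrd
```

"different" is also printed when the module is missing from either tree, which is itself worth knowing for the initrd case.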