JP6.0 Bring your own kernel: display driver version mismatch

Hello,

We are trying to get our own kernel flashed to our Jetson Orin NX using a Seeedstudio A603 carrier (similar to devkit, works out-of-the-box with 36.3.0).

Below is our flashing routine:

export L4T_BASE_DIR=`pwd`/Linux_for_Tegra
export CROSS_COMPILE=`pwd`/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
export LOCALVERSION=-tegra
export KERNEL_SRC_PATH=$L4T_BASE_DIR/source/kernel/kernel-jammy-src

# Source sync to latest tag 
cd source
./source_sync.sh -t rel-36_eng_2024-07-23
rm -rf kernel/kernel-jammy-src

# Clone linux stable in place of the removed kernel tree
git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git kernel/kernel-jammy-src
git -C kernel/kernel-jammy-src checkout linux-6.8.y

# Configure kernel
DEFCONFIG=$KERNEL_SRC_PATH/arch/arm64/configs/defconfig
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_ARM64_PMEM
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_PCIEASPM_PERFORMANCE
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_PCIE_TEGRA194
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_PCIE_TEGRA194_HOST
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_BLK_DEV_NVME
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_NVME_CORE
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_FB_SIMPLE
$KERNEL_SRC_PATH/scripts/config --file "$DEFCONFIG" --enable CONFIG_CRYPTO_ECDSA  # required for nvdisplay build

# Build kernel and OOT modules
make -C kernel
export KERNEL_HEADERS=$PWD/kernel/kernel-jammy-src
make modules

# Prepare rootfs
sudo tar -xvf tegra_linux_sample-root-filesystem_r36.3.0_aarch64.tbz2 -C $L4T_ROOTFS_DIR
sudo ./apply_binaries.sh

# Install kernel and modules & update initrd
cp kernel/kernel-jammy-src/arch/arm64/boot/Image $L4T_BASE_DIR/kernel/Image
sudo -E make install -C kernel
sudo -E make modules_install
sudo ./tools/l4t_update_initrd.sh

# Flash
sudo tools/l4t_create_default_user.sh -u nvidia -p nvidia -a -n nvidia-desktop --accept-license
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1  -c tools/kernel_flash/flash_l4t_t234_nvme.xml -p "-c bootloader/generic/cfg/flash_t234_qspi.xml"  --showlogs --network usb0 p3509-a02-p3767-0000 internal

Flashing succeeds, but the display does not work at boot.
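Since the options above are written into the defconfig, Kconfig may silently drop any whose dependencies are not met in 6.8, so it can be worth checking the generated .config. A minimal sketch, assuming the build leaves its .config in kernel/kernel-jammy-src (the helper name missing_config_options is made up for illustration):

```shell
# Hypothetical helper: print every listed option that is NOT set to y or m
# in the given .config file.
# Usage: missing_config_options <config-file> <CONFIG_NAME>...
missing_config_options() {
    config_file=$1
    shift
    for opt in "$@"; do
        # An enabled option appears as CONFIG_FOO=y or CONFIG_FOO=m.
        grep -q -E "^${opt}=[ym]\$" "$config_file" || echo "$opt"
    done
}
```

For example: missing_config_options "$KERNEL_SRC_PATH/.config" CONFIG_PCIE_TEGRA194 CONFIG_BLK_DEV_NVME CONFIG_CRYPTO_ECDSA. Empty output means every listed option survived into the final configuration.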

Running the following through the UART console makes the display work:
sudo modprobe -r tegra-drm

When reloading the nvidia-modeset module, dmesg shows:

[  535.951096] NVRM: API mismatch: the client has the version 540.3.0, but
               NVRM: this kernel module has the version 540.4.0.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.

Indeed, in the sources, the nv-kernel-display-driver repository at the synced tag is at 540.4, and has been since tag rel-36_eng_2024-04-04. However, all .deb packages in the 36.3.0 BSP (copied to the rootfs by apply_binaries.sh) seem to have been built for 540.3 (for example libglxserver and the Xorg-related drivers).
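To confirm which two versions disagree on a given boot, the message can be parsed directly. A minimal sketch, assuming the NVRM message keeps the wording shown above (the helper name nvrm_mismatch_versions is made up for illustration):

```shell
# Hypothetical helper: read dmesg text on stdin and print the client and
# kernel-module versions from an NVRM "API mismatch" message.
nvrm_mismatch_versions() {
    # The message wraps over several lines, so join them before matching.
    tr '\n' ' ' |
        sed -n -E 's/.*client has the version ([0-9.]+), but.*kernel module has the version ([0-9]+(\.[0-9]+)*).*/client=\1 module=\2/p'
}
```

For example: sudo dmesg | grep -A 3 'API mismatch' | nvrm_mismatch_versions. For the message quoted above this prints client=540.3.0 module=540.4.0.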

Attached logs from dmesg and Xorg.

I have tested with several other kernel versions, namely 6.11, 6.10 and 6.8.12, and see the same issue with all of them.

I have also tried falling back to rel-36_eng_2024-04-04, but that brings other issues and also limits the kernel versions that can be used.

It seems I am missing something needed to get the GPU drivers and display working correctly. Any hints?

Thank you
dmesg.txt (64.3 KB)
Xorg.0.log (6.8 KB)

Will it work if you use the version from the website?

Yes, there is no problem with the default 36.3 release and a modified 5.15 kernel. However, it does not work with other kernel sources and versions.

Could you share the steps you used to download the other version of the kernel display driver?

In the Linux_for_Tegra/source folder, after running ./source_sync.sh -t rel-36_eng_2024-07-23, I have:

dtc-src              kernel                   nvdisplay     nv_src_build.sh
generic_rt_build.sh  kernel_src_build_env.sh  nvethernetrm  out
hardware             Makefile                 nvgpu         source_sync.sh
hwpm                 nvbuild.sh               nvidia-oot    tegra

nvdisplay being nv-tegra.nvidia.com/tegra/kernel-src/nv-kernel-display-driver.git which is 540.4 on tag rel-36_eng_2024-07-23.

nvidia-smi

Failed to initialize NVML: Driver/library version mismatch
NVML library version: 540.03

cat /proc/driver/nvidia/version

NVRM version: NVIDIA UNIX Open Kernel Module for aarch64  540.4.0  Release Build 
GCC version:  collect2: error: ld returned 1 exit status

Unloading the tegra-drm module immediately brings the Ubuntu desktop to the screen.
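To check whether tegra-drm is actually loaded before and after such a test, /proc/modules can be queried. A minimal sketch (the helper name module_loaded is made up for illustration; the kernel lists module names with underscores, so tegra-drm shows up as tegra_drm):

```shell
# Hypothetical helper: succeed if the named module appears in a
# /proc/modules-style listing (second argument, defaults to /proc/modules).
module_loaded() {
    # Normalise hyphens to underscores, as the kernel does for module names.
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file=${2:-/proc/modules}
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$modules_file"
}
```

For example: module_loaded tegra-drm && echo "tegra-drm is loaded".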

modinfo tegra-drm:

filename:       /lib/modules/6.8.12-tegra-dirty/updates/drivers/gpu/drm/tegra/tegra-drm.ko
license:        GPL v2
description:    NVIDIA Tegra DRM driver
author:         Thierry Reding <thierry.reding@avionic-design.de>
softdep:        pre: tegra_wmark
import_ns:      DMA_BUF

After reloading the module with sudo modprobe tegra-drm, the screen leaves the GUI and drops to the non-interactive console.
Most relevant section from the log:
modprobe-tegra-drm.txt (38.9 KB)

[ 4532.780421] host1x drm: iommu configuration for device failed with -ENOENT
...
[ 4536.729750] drm drm: [drm] No compatible format found
[ 4536.733366] drm drm: [drm] Cannot find any crtc or sizes
[ 4536.733537] [drm] Initialized tegra 1.0.99 20120330 for drm on minor 0
... 

The first IOMMU error does not seem to be a problem, but I imagine the "No compatible format found" message could be.

Other logs attached:
lsmod.txt (4.8 KB)
Xorg.log (17.5 KB)
dmesg_after_modprobe.txt (66.8 KB)
dmesg_before_modprobe.txt (66.6 KB)


Hi @WayneWWW, have you been able to reproduce this, or do you have any idea of what I could try?

Thank you

Hi,

Checked with the internal team. You will have to wait for the next BSP release to use this new source code.

Seems like it’s more an issue with some missing kernel config rather than a code issue.
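One way to test that idea is to list the options enabled in the working 5.15 configuration that are no longer enabled in the 6.8 one. A minimal sketch, assuming both .config files are available as plain text (the helper name config_enabled_only_in_first and the file names in the example are made up for illustration):

```shell
# Hypothetical helper: print options set to y or m in the first config
# that are not set to y or m in the second config.
# Usage: config_enabled_only_in_first <reference.config> <new.config>
config_enabled_only_in_first() {
    ref_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # Keep only enabled options, reduced to their names, in a fixed sort order.
    grep -E '^CONFIG_[A-Za-z0-9_]+=[ym]$' "$1" | cut -d= -f1 | LC_ALL=C sort -u > "$ref_sorted"
    grep -E '^CONFIG_[A-Za-z0-9_]+=[ym]$' "$2" | cut -d= -f1 | LC_ALL=C sort -u > "$new_sorted"
    # comm -23 keeps lines that appear only in the first file.
    LC_ALL=C comm -23 "$ref_sorted" "$new_sorted"
    rm -f "$ref_sorted" "$new_sorted"
}
```

For example: config_enabled_only_in_first config-5.15-tegra "$KERNEL_SRC_PATH/.config" | grep -i -E 'DRM|TEGRA|IOMMU'. The kernel tree also ships scripts/diffconfig for a fuller comparison that includes changed values.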

Is there a planned date for the next release?