Inaccuracies in steps to build 5.15 kernel module

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
1.9.3.10904
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

I am building the 5.15 kernel by following the steps listed here: Kernel 5.15

At the step where you are to run this command:

sudo -E /usr/bin/python3 -B /opt/nvidia/driveos/common/filesystems/build-fs/17/bin/build_fs.py -w ${NV_WORKSPACE}/ -i $PWD/update_rfs.CONFIG.json -o ${NV_WORKSPACE}/drive-linux/filesystem/targetfs-images/

it fails with:

FileNotFoundError: [Errno 2] No such file or directory: '/drive/drive-linux/kernel/preempt_rt/modules/5.15.98-rt-tegra/updates/dkms/efa.ko'
2023-12-03 19:59:30,131 [ERROR]

Exiting due to error:
Command returned non-zero error code:
/usr/bin/python3 /opt/nvidia/driveos/common/filesystems//copytarget/1/copytarget.py /tmp/tmpeovxs9l7//targetfs/ /drive/ /drive//drive-linux//filesystem//copytarget/manifest//copytarget-kernel-modules.yaml --source-type pdk_sdk_installed_path   --filesystem-type standard
2023-12-03 19:59:30,131 [INFO]
Executing Cleanup Routine for Linux Build-FS on Exit.

2023-12-03 19:59:30,132 [INFO]
Executing Cleanup Routine for Build-FS on Exit.

efa.ko is located here:

./drive-linux/kernel/preempt_rt/modules/5.15.98-rt-tegra/kernel/drivers/infiniband/hw/efa/efa.ko

The situation is similar for pretty much all the InfiniBand-related modules. Did I miss a step, or are the docs wrong?
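For anyone reproducing this, here is a minimal sketch for listing every copy of a module under the modules tree, so the actual location can be compared with the path in the error. The `find_module` helper and the `MODULES_ROOT` variable are hypothetical names, and the default root is an assumption based on the /drive workspace shown in the error above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRIVE OS): print every copy of a module
# file found under a kernel modules tree, one path per line, sorted.
find_module() {
    # $1 = modules root directory, $2 = module file name (e.g. efa.ko)
    [ -d "$1" ] || return 0
    find "$1" -type f -name "$2" | LC_ALL=C sort
}

# Assumption: the workspace layout matches the paths in the error message.
MODULES_ROOT="${MODULES_ROOT:-${NV_WORKSPACE:-/drive}/drive-linux/kernel/preempt_rt/modules}"
find_module "$MODULES_ROOT" efa.ko
```

If the only hit is under kernel/drivers/... and nothing is listed under updates/dkms, the tree matches the situation described above.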

Dear @collin.day,
I just checked in my Docker container and found /drive/drive-linux/kernel/preempt_rt/modules/5.15.98-rt-tegra/updates/dkms/efa.ko there. Is it removed after following the kernel recompilation steps?

Could you check and confirm?

Dear @collin.day,
Could you check whether the steps below work?

  1. To flash the built kernel, update the kernel Image, kernel modules, and then the filesystem

a. Copy the uncompressed (Image) kernel images to the top of the kernel directory with the following command:

export PROD_SUFFIX=""
sudo rm -fv $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/images/*
sudo cp -v ${PWD}/out-linux/arch/arm64/boot/Image ${PWD}/out-linux/vmlinux ${PWD}/out-linux/System.map $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/images/

CAUTION: Before copying the new kernel images, back up the default kernels provided.
b. Copy the built modules to the SDK kernel modules path:

cp -r $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules_BKUP
sudo rm -rf $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/*
mkdir -p  $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/
sudo cp -a ${PWD}/out-linux/lib/modules/* $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/
cp -r $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules_BKUP/5.15.98-rt-tegra/updates $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/5.15.98-rt-tegra/
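Before re-running build_fs.py, it may be worth confirming that the updates/dkms directory was actually restored by the last copy. The following is only a sketch: `count_dkms_modules`, `KREL`, and `MODDIR` are hypothetical names, and the default paths assume the workspace layout and kernel release used in this thread.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of DRIVE OS): count the .ko files
# under <modules dir>/updates/dkms. Prints 0 if the directory is missing.
count_dkms_modules() {
    # $1 = modules directory for one kernel release
    [ -d "$1/updates/dkms" ] || { echo 0; return 0; }
    find "$1/updates/dkms" -type f -name '*.ko' | wc -l | tr -d ' '
}

# Assumptions: kernel release and workspace layout as quoted in this thread.
KREL="${KREL:-5.15.98-rt-tegra}"
MODDIR="${NV_WORKSPACE:-/drive}/drive-linux/kernel/preempt_rt${PROD_SUFFIX:-}/modules/$KREL"

n=$(count_dkms_modules "$MODDIR")
if [ "$n" -eq 0 ]; then
    echo "No DKMS modules under $MODDIR/updates/dkms"
else
    echo "$n DKMS modules found under $MODDIR/updates/dkms"
fi
```

A count of 0 suggests the copy from modules_BKUP did not happen, in which case the FileNotFoundError reported earlier in this thread would be expected again.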

After compiling the kernel, up to step 12a, efa.ko is found here:

./drive-linux/filesystem/targetfs/usr/lib/modules/5.15.98-rt-tegra/updates/dkms/efa.ko
./drive-linux/kernel/preempt_rt/modules/5.15.98-rt-tegra/updates/dkms/efa.ko

After following your updated step 12b, the call to build_fs.py works. I think the modules are being removed in this step:

sudo rm -rf $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/*   # <-- this line
mkdir -p  $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/
sudo cp -a ${PWD}/out-linux/lib/modules/* $NV_WORKSPACE/drive-linux/kernel/preempt_rt${PROD_SUFFIX}/modules/

They are only available again if you choose to rebuild them during menuconfig.
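To see exactly which modules disappear, one option is to compare the backup tree with the freshly copied one. This is a sketch under assumptions: `list_modules`, `missing_modules`, `BKUP`, and `NEW` are hypothetical names, and it presumes a modules_BKUP copy was taken before the rm, as in the updated step 12b.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DRIVE OS).

# Print the relative paths of all .ko files under $1, sorted.
list_modules() {
    ( cd "$1" && find . -type f -name '*.ko' | LC_ALL=C sort )
}

# Print the modules that exist in the backup tree ($1) but not in the new tree ($2).
missing_modules() {
    a=$(mktemp)
    b=$(mktemp)
    list_modules "$1" > "$a"
    list_modules "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Assumption: workspace layout as used in the steps quoted in this thread.
BASE="${NV_WORKSPACE:-/drive}/drive-linux/kernel/preempt_rt${PROD_SUFFIX:-}"
BKUP="$BASE/modules_BKUP"
NEW="$BASE/modules"
if [ -d "$BKUP" ] && [ -d "$NEW" ]; then
    missing_modules "$BKUP" "$NEW"
fi
```

Any path printed under updates/dkms is a module that the original instructions delete without restoring.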

My question is: why does this depend at all on a module for Amazon Web Services? It would make more sense (in my mind) to make it optional and provide instructions for rebuilding it when people build the kernel module. Either way, the default instructions are probably not right for the 5.15 kernel / 6.0.8.1 SDK combination.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Dear @collin.day,
Could you confirm whether the steps provided above help to fix the issue?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.