GPU not loading when updating to JetPack 4.6

Hello,

I’ve been developing a custom MIPI camera, which I connect to the Jetson Xavier NX. Everything works correctly in JetPack 4.5: I get video, and my video devices, backed by V4L2 and a custom driver, work as expected. The GPU driver loads properly and I can see the module with lsmod.

I want to update to JetPack 4.6, but there the GPU driver does not load (lsmod shows no nvgpu module). The device tree is the same as the one I use in JetPack 4.5. I made sure that the kernel is also the latest one (4.9.253-tegra).

In dmesg I get the following errors:
nvgpu: Unknown symbol l1ss_deregister_client (err 0)
nvgpu: Unknown symbol l1ss_submit_rq (err 0)
nvgpu: Unknown symbol l1ss_register_client (err 0)
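
For anyone triaging similar output, here is a minimal Python sketch that collects the unresolved symbols per module from dmesg text. The function name missing_symbols and the SAMPLE constant are illustrative; the sample lines are the three errors quoted above.

```python
import re

# Matches kernel log lines such as
# "nvgpu: Unknown symbol l1ss_submit_rq (err 0)".
# search() is used so a leading dmesg timestamp does not matter.
_UNKNOWN_SYMBOL = re.compile(r"(\w+): Unknown symbol (\w+) \(err (-?\d+)\)")


def missing_symbols(dmesg_text):
    """Return {module_name: sorted list of symbols it could not resolve}."""
    found = {}
    for line in dmesg_text.splitlines():
        match = _UNKNOWN_SYMBOL.search(line)
        if match:
            module, symbol, _err = match.groups()
            found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


# The three error lines from this post.
SAMPLE = """\
nvgpu: Unknown symbol l1ss_deregister_client (err 0)
nvgpu: Unknown symbol l1ss_submit_rq (err 0)
nvgpu: Unknown symbol l1ss_register_client (err 0)
"""

if __name__ == "__main__":
    for module, symbols in missing_symbols(SAMPLE).items():
        print(module, "is missing:", ", ".join(symbols))
```

All three symbols share the l1ss_ prefix, which suggests that a single kernel component is absent rather than three unrelated ones.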

Kind regards,
Goran Broeckaert

I think you have already named the root cause yourself: your device tree is still the same, and that is why you get the error.

The device tree and the kernel are paired. When the kernel is updated, the device tree should be updated too.

You should not combine the device tree from one release with the kernel from another.

Also, since nvgpu is a kernel module and not part of the kernel image, you have to make sure that the kernel module version matches the kernel image as well.
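
One quick way to check the module/kernel pairing is to compare the module's vermagic string with the running kernel release. The sketch below assumes you obtain the two strings yourself, for example with `modinfo -F vermagic <path to nvgpu.ko>` and `uname -r`. The helper names and the example vermagic string are illustrative and not taken from this thread.

```python
def vermagic_release(vermagic):
    """Return the kernel release, which is the first field of a vermagic string."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def module_matches_kernel(vermagic, running_release):
    """True if the module was built for the given kernel release string."""
    return vermagic_release(vermagic) == running_release.strip()


if __name__ == "__main__":
    # Illustrative values only; substitute the real output of
    # `modinfo -F vermagic ...` and `uname -r` on the target.
    example_vermagic = "4.9.253-tegra SMP preempt mod_unload modversions aarch64"
    example_release = "4.9.253-tegra\n"
    print(module_matches_kernel(example_vermagic, example_release))
```

A matching release is necessary but not sufficient: a module can carry the right version string and still fail to load if the kernel it is loaded into was built with a different configuration.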

I compile the kernel myself. In my top-level DTS I include tegra194-p3668-common.dtsi. This dtsi comes with the kernel, so I assume it is updated as well when I update the kernel.

I will look into this further, though, and see if I can find something.

Thanks for the fast reply.

Please make sure you also rebuild the nvgpu driver and follow the method from the developer guide to install all of the modules.

I’ve compared my device tree with the device tree used in the example for JetPack 4.6, and everything related to the GPU (gv11b is the name of the device in the device tree) is the same. I’ve made sure that the kernel module is installed. I can find the kernel module at “Linux_for_Tegra/rootfs/lib/modules/4.9.253-tegra/kernel/drivers/gpu/nvgpu/nvgpu.ko”.

I’ve looked at the error, which says that symbols are missing, and searched the kernel sources to check whether those symbols are actually defined. They are there, so is it possible that the code defining them isn’t being compiled into the kernel?
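
One way to test that suspicion on the running target is to look for the symbols in /proc/kallsyms, where each line has the form "address type name", optionally followed by a module name in brackets. The sketch below is a rough check under that assumption; the helper names are made up, and the sample text uses invented addresses and a hypothetical module purely for illustration.

```python
def defined_symbols(kallsyms_text):
    """Collect symbol names from text in /proc/kallsyms format."""
    names = set()
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Expected fields: address, type, name, optional "[module]".
        if len(parts) >= 3:
            names.add(parts[2])
    return names


def undefined(required, kallsyms_text):
    """Return the required symbols that do not appear in the kallsyms text."""
    present = defined_symbols(kallsyms_text)
    return [name for name in required if name not in present]


# Hypothetical sample: addresses and the module name are invented.
SAMPLE_KALLSYMS = (
    "ffffff8008081000 T _text\n"
    "ffffff80081a2b3c T l1ss_register_client\n"
    "ffffff8000f00000 t example_helper\t[examplemod]\n"
)

if __name__ == "__main__":
    wanted = ["l1ss_register_client", "l1ss_submit_rq", "l1ss_deregister_client"]
    print(undefined(wanted, SAMPLE_KALLSYMS))
```

On a real system you would pass in the contents of /proc/kallsyms. A symbol that is absent there was not built into the running kernel or any loaded module, which points at the kernel configuration rather than at the module itself.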

Are any changes needed in the defconfig? I ask because I also modified the defconfig to better suit my needs.

Kind regards,
Goran Broeckaert

I’ve looked into the defconfig differences and the Kconfig of the module responsible for those symbols, and I was missing a configuration option needed to set up the NVIDIA driver. I added the following two lines: CONFIG_SND_HDA_INTEL=m and CONFIG_TEGRA_SAFETY=y. I assume the critical one is the latter.
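
For anyone who maintains a customized defconfig, here is a minimal Python sketch of a check that can be run before building. It parses kernel config text and reports options that do not have the expected value. The function names and the sample config are illustrative; the two option names are the ones mentioned above.

```python
_NOT_SET_SUFFIX = " is not set"


def parse_config(text):
    """Parse defconfig/.config text into {option: value}; unset options map to "n"."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(_NOT_SET_SUFFIX):
            # Lines of the form "# CONFIG_FOO is not set".
            options[line[2:-len(_NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def wrong_options(config_text, required):
    """Return {option: actual value} for every required option that differs."""
    options = parse_config(config_text)
    return {
        name: options.get(name, "n")
        for name, wanted in required.items()
        if options.get(name, "n") != wanted
    }


# Options named in this post.
REQUIRED = {"CONFIG_SND_HDA_INTEL": "m", "CONFIG_TEGRA_SAFETY": "y"}

# Illustrative config in which one of the options is missing.
SAMPLE_CONFIG = """\
CONFIG_SND_HDA_INTEL=m
# CONFIG_TEGRA_SAFETY is not set
"""

if __name__ == "__main__":
    print(wrong_options(SAMPLE_CONFIG, REQUIRED))
```

Options that are absent from the file are treated as "n" here, which is a simplification: in a real build, Kconfig defaults and dependencies decide the final value, so the generated .config is the more reliable file to check.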

Thanks for the help,

Kind regards