Now I have also tried downgrading to R34.1.1 and JetPack 5.0.1.
It has exactly the same problem.
Hi,
How did you rebuild this ko file? Did you use the same toolchain as the original kernel?
I used the toolchain on the jetson.
I have successfully built other kernel modules this way. But they did not depend on other modules…
So in this case I must cross compile with Bootlin Toolchain gcc 9.3, since this module depends on a builtin module?
I now did the following:
I flashed the 35.1 release and installed JetPack.
I compiled the kernel on my PC as described here:
https://docs.nvidia.com/jetson/archives/r35.1/DeveloperGuide/text/SD/Kernel/KernelCustomization.html
with the Driver Package (BSP) Sources and Bootlin Toolchain gcc 9.3 from:
https://developer.nvidia.com/embedded/jetson-linux
I successfully cross compiled my kernel module linked to the built kernel with:
export CROSS_COMPILE_AARCH64_PATH=~/jetson/l4t-gcc/
export CROSS_COMPILE_AARCH64=~/jetson/l4t-gcc/bin/aarch64-buildroot-linux-gnu-
export TEGRA_KERNEL_DIR=~/jetson/kernel/35.1/Linux_for_Tegra/source/public/kernel/
export CROSS_COMPILE=~/jetson/l4t-gcc/bin/aarch64-buildroot-linux-gnu-
make ARCH=arm64 -C $TEGRA_KERNEL_DIR../kernel_out M=$PWD
But I still get the “disagrees about version of symbol” error on the Jetson AGX Orin :-(
What am I missing?
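For anyone hitting the same message: as far as I know, “disagrees about version of symbol” is printed when the symbol CRC recorded in the module does not match the CRC of the kernel or module that exports the symbol (CONFIG_MODVERSIONS). A rough sketch for comparing the CRC of one symbol across two Module.symvers files follows. The function names are made up for illustration, and the tab-separated column layout (CRC, symbol, module, export type) is an assumption about the kernel version in use.

```shell
# Sketch: look up the CRC recorded for an exported symbol in a Module.symvers file.
# Assumes tab-separated columns: CRC, symbol, module, export type.
symbol_crc() (
    symbol=$1
    symvers=$2
    awk -F '\t' -v sym="$symbol" '$2 == sym { print $1 }' "$symvers"
)

# Report whether two Module.symvers files agree on a symbol's CRC.
# Which two files to compare is up to you; the paths are not fixed by this sketch.
compare_symbol_crc() (
    a=$(symbol_crc "$1" "$2")
    b=$(symbol_crc "$1" "$3")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "match $a"
    else
        echo "mismatch ${a:-missing} ${b:-missing}"
    fi
)

# Example call (hypothetical paths):
#   compare_symbol_crc nvidia_p2p_get_pages kernel_out/Module.symvers other/Module.symvers
```

A "mismatch" or "missing" result would suggest that the module was built against a different export of the symbol than the one present on the target.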
Hi,
I don’t think this driver has been validated on JetPack 5 before, and its dependency has a problem too.
For example, nvidia_p2p_get_pages does not seem to exist.
Some more observations. One problem seems to be that nvidia-p2p is not loaded. When I try to load this module manually, it fails with “exports duplicate symbol”, with the nvidia module reported as the owner. As an experiment I unloaded the nvidia module. It is used by the graphical system, so that must be disabled first.
sudo systemctl set-default multi-user.target
sudo reboot
*LOGIN AFTER REBOOT*
sudo modprobe -r nvidia
sudo modprobe nvidia-p2p
sudo insmod gdrdrv.ko
And hey, I can load my module! Even the module built locally on the Jetson can be loaded. I have no means to actually verify p2p functionality at this stage.
WARNING! Doing this seems to kill the DisplayPort output, and you can only access the Jetson over ssh, even after a reboot!
You can restore the system DisplayPort output with:
sudo systemctl set-default graphical.target
sudo reboot
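Before running the modprobe steps above, it can help to check which of the two modules is currently loaded. A minimal sketch, assuming the modules show up in /proc/modules as nvidia and nvidia_p2p (dashes become underscores there). The function names are made up for illustration.

```shell
# Sketch: succeed if a module appears in a /proc/modules-style listing.
# The second argument defaults to /proc/modules; pass another file to test offline.
module_loaded() (
    name=$(printf '%s' "$1" | sed 's/-/_/g')   # loaded names use underscores
    listing=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
)

# Summarize the state relevant to the nvidia / nvidia-p2p conflict.
check_p2p_conflict() (
    listing=${1:-/proc/modules}
    if module_loaded nvidia "$listing"; then
        echo "nvidia is loaded: unload it before loading nvidia-p2p"
    elif module_loaded nvidia-p2p "$listing"; then
        echo "nvidia-p2p is loaded"
    else
        echo "neither module is loaded"
    fi
)

# Example call on the Jetson:
#   check_p2p_conflict
```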
Hi,
We are checking this issue with our internal team.
Will share more information with you later.
Thanks
For jetson-rdma-picoevb, how are you compiling the kernel module? I mean as iGPU or dGPU.
Hi,
We can find the nvidia_p2p_get_pages symbol in the kernel_src.tbz2 of r35.1.
Could you please check it again?
$ grep -ir nvidia_p2p_get_pages
kernel/nvidia/drivers/nv-p2p/nvidia-p2p.c:int nvidia_p2p_get_pages(u64 vaddr, u64 size,
kernel/nvidia/drivers/nv-p2p/nvidia-p2p.c:EXPORT_SYMBOL(nvidia_p2p_get_pages);
kernel/nvidia/include/linux/nv-p2p.h:int nvidia_p2p_get_pages(u64 vaddr, u64 size,
kernel/nvidia/include/linux/nv-p2p.h: * Map the pages retrieved using nvidia_p2p_get_pages and
Thanks.
Hi,
We just got some feedback from our internal team that nv-p2p.ko and nvidia.ko cannot be used together.
Do you want to use them at the same time?
Thanks.
I want to do p2p and have display output simultaneously. I would assume I need both for that. Am I correct?
nvidia.ko, nvidia-modeset and nvgpu.ko are responsible for the display to work fine.
Is nvidia.ko required to run programs using CUDA?
The jetson-rdma-picoevb is built with the script for the iGPU of the Jetson, on the Jetson itself.
Hi, both
The same symbols are defined in both nvidia.ko and nv-p2p.ko.
So they cannot be added to the kernel at the same time.
nvidia.ko was only loaded for the dGPU use case.
That’s why we didn’t expect it to be loaded when designing nv-p2p.ko.
We are double-checking if nvidia.ko is required for Orin’s functionality.
Could you also test if it works by only adding the nv-p2p.ko into the kernel?
Thanks.
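To see which exported symbols two modules have in common, something like the following sketch might help. It assumes that exported symbols show up in `nm` output as `__ksymtab_<name>` entries, which I have not verified for these particular modules. The function names and file names are made up for illustration.

```shell
# Sketch: pull exported symbol names out of `nm` output read on stdin.
# Assumes exports appear as "__ksymtab_<name>" entries.
ksymtab_names() {
    sed -n 's/^.*[[:space:]]__ksymtab_//p' | sort -u
}

# Print names present in both lists (one symbol name per line in each file).
duplicate_exports() (
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
)

# Example calls (hypothetical file names):
#   nm nvidia.ko     | ksymtab_names > nvidia.syms
#   nm nvidia-p2p.ko | ksymtab_names > p2p.syms
#   duplicate_exports nvidia.syms p2p.syms
```

Any name printed by the last call is exported by both modules, which would be consistent with the “exports duplicate symbol” failure described above.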
Hi,
The nvidia.ko is used for display on Orin.
Removing it may also affect some functionality that requires the graphics driver (e.g. Argus).
Thanks.
I have already verified that loading only nv-p2p.ko works. See (GPUDirect RDMA - Module can not be insert into kernel - #10 by DigPat)
I’m not sure I understand… but let’s start with my goal.
I have a card with an onboard FPGA connected to the PCIe slot on the Orin. I want to do peer-to-peer data transactions to the iGPU memory, using the functions defined in nv-p2p.h together with CUDA.
- Is this possible?
- What module should I load?
- Can I also have display output at the same time?
Thanks for the suggestion. At least I can load my PCIe device driver kernel module now.
Looking forward to hearing from NVIDIA about a fix.
Hi,
Sorry for missing your question.
Currently, nvidia.ko and nv-p2p.ko cannot be added together due to the symbol issue.
Without nvidia.ko, the graphics driver cannot be loaded, so the display and some features that require the graphics driver won’t work.
But a pure CUDA application should not be affected.
We are discussing the possible fix with our internal team.
We will share more information when we get more feedback.
The answers to your questions:
- Yes
- nv-p2p.ko only
- Unfortunately no.
Thanks.
Thanks for clarifying. I hope you can find a solution soon.