Issues with nv-p2p.h and nvidia-peermem for Jetpack 6.0

Hi! I'm developing a loadable kernel driver based on jetson-rdma-picoevb that uses GPUDirect. The driver was tested with R35.2.1, and I'm now migrating to R36.3 (JetPack 6).

In the GPUDirect documentation, there is a section called "Changes in CUDA 12.2" that reports the deprecation of nv_peer_mem and introduces the new nvidia_peermem module.

I searched my system and can see the module in the nvidia-oot folder. As I understand it, nv_peer_mem should be in this directory and there should be an nvidia_peermem somewhere, but it does not exist yet.

When I tried to build my driver, a fatal error was raised:

linux/nv-p2p.h: No such file or directory

I tried following the "Using nvidia-peermem" section, but I can't understand it. There is an info note:

Note: If the NVIDIA GPU driver is installed before MLNX_OFED, the GPU driver must be uninstalled and installed again to make sure nvidia-peermem is compiled with the RDMA APIs that are provided by MLNX_OFED.

It's not clear to me. I found in some topics that "nvidia-jetpack contains all drivers and essential software". So I uninstalled it using "apt autoremove --purge nvidia-jetpack" and installed MLNX_OFED successfully. After that, I installed nvidia-jetpack again. However, there is still no nvidia-peermem module.

When I manually installed nvidia-driver-550, the nvidia-peermem module was installed, but obviously I can't use it because it's not compiled for ARM.

What are my mistakes? How can I configure GPUDirect with this JetPack version? Do I have to downgrade my JetPack version?

Hi,

We need to check the nvidia-peermem issue with our internal team.
But linux/nv-p2p.h can be found at the path below:

/usr/src/nvidia/nvidia-oot/include/linux/nv-p2p.h

Does your driver fail only because of the missing nv-p2p.h file?
Thanks.

Hi, AastaLLL! Thanks a lot!

So, I added the path to my Makefile and nv-p2p.h was found, but another error was raised for some functions:

ERROR: modpost: “nvidia_p2p_*” [<path>] undefined!

It looks like the header is found, but the actual implementation is undefined. So I tried to install nv_peer_mem manually (so far, this module isn't available via lsmod or modprobe). Is that the correct approach? I wasn't successful.

Is there any guidance or are there tips for using the out-of-tree nv_peer_mem? Should I install this module manually?

Hi, AastaLLL! Any updates?

Hi,

Sorry for the late update.

We are checking the status with our internal team.
Will share more info with you later.

Thanks.

Hi,

Our internal team has validated the RDMA driver, so it's expected to work.
Please note that you need to check out the rel-36+ branch for the JetPack 6 environment.

Regarding the “linux/nv-p2p.h: No such file or directory” error, please update the include path and symbols as shown below.

KBUILD_EXTRA_SYMBOLS=$(NVIDIA_EXTRA_SYMBOLS)
ccflags-y += -I$(NVIDIA_SRC_DIR)

Also, refer to this for how to get NVIDIA_SRC_DIR and KBUILD_EXTRA_SYMBOLS.
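For context, here is a rough sketch of how those two lines might sit in a complete out-of-tree module Makefile. It is only an illustration under some assumptions: the module name my_rdma_driver is hypothetical, the include directory is inferred from the header location mentioned earlier in this thread, and the Module.symvers path is a placeholder that has to point at wherever the nvidia-oot modules were actually built on your system.

```makefile
# Hypothetical module name, used for illustration only.
obj-m := my_rdma_driver.o

# Assumption: the driver includes <linux/nv-p2p.h>, so the include root is
# the directory that contains linux/nv-p2p.h (path taken from this thread).
NVIDIA_SRC_DIR ?= /usr/src/nvidia/nvidia-oot/include

# Placeholder: must point at the Module.symvers generated when the
# nvidia-oot modules were built. The real location depends on your setup.
NVIDIA_EXTRA_SYMBOLS ?= /path/to/nvidia-oot/Module.symvers

# Let modpost resolve the nvidia_p2p_* symbols exported by the NVIDIA modules.
KBUILD_EXTRA_SYMBOLS := $(NVIDIA_EXTRA_SYMBOLS)

# Make the NVIDIA headers visible to the compiler.
ccflags-y += -I$(NVIDIA_SRC_DIR)

# Kernel build tree for the running kernel (override for cross builds).
KDIR ?= /lib/modules/$(shell uname -r)/build

all:
	$(MAKE) -C $(KDIR) M=$(CURDIR) modules

clean:
	$(MAKE) -C $(KDIR) M=$(CURDIR) clean
```

Both variables are set with ?= so they can be overridden on the command line, for example: make NVIDIA_EXTRA_SYMBOLS=<path to Module.symvers>. The include path is what addresses the missing header error, and KBUILD_EXTRA_SYMBOLS is what addresses the modpost "undefined" errors for nvidia_p2p_*.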

Thanks.

Thanks a lot, AastaLLL!
