Issues with nv-p2p.h and nvidia-peermem for JetPack 6.0

rubens.lima · October 16, 2024, 6:52pm

Hi! I am developing a loadable kernel driver based on jetson-rdma-picoevb that uses GPUDirect. The driver was tested with R35.2.1, and I’m now migrating to R36.3 (JetPack 6).

In the GPUDirect documentation, there is a section called “Changes in CUDA 12.2” that reports the deprecation of nv_peer_mem and introduces the new module nvidia_peermem.

I searched my system, and I can see the module in the nvidia-oot folder. My understanding is that nv_peer_mem should be in this directory and that an nvidia_peermem module should exist somewhere, but I can’t find one yet.

When I tried to build my driver, a fatal error was raised:

linux/nv-p2p.h: No such file or directory

I tried following the “Using Nvidia Peermem” section, but I don’t understand it. It includes this note:

Note: If the NVIDIA GPU driver is installed before MLNX_OFED, the GPU driver must be uninstalled and installed again to make sure nvidia-peermem is compiled with the RDMA APIs that are provided by MLNX_OFED.

It’s not clear to me. I read in some topics that “nvidia-jetpack contains all drivers and essential softwares”. So I uninstalled it with “apt autoremove --purge nvidia-jetpack” and then installed MLNX_OFED successfully. After that, I installed nvidia-jetpack again. However, there is still no nvidia-peermem module.

When I manually installed nvidia-driver-550, the nvidia-peermem module was installed, but obviously I can’t use it because it is not compiled for Arm.

What am I doing wrong? How can I configure GPUDirect with this JetPack version? Do I have to downgrade my JetPack version?

AastaLLL · October 17, 2024, 3:08am

Hi,

We need to check the nvidia-peermem issue with our internal team.
However, linux/nv-p2p.h can be found at the path below:

/usr/src/nvidia/nvidia-oot/include/linux/nv-p2p.h

Is the missing nv-p2p.h file the only error your driver build fails with?
Thanks.
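
A quick way to confirm the header before editing the Makefile can be sketched as a small shell helper. This is only an illustration: the function name `find_nv_p2p_header` is hypothetical, and the include root in the example call is inferred from the path quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the path of nv-p2p.h
# under a given include root, or return non-zero if it is missing.
find_nv_p2p_header() {
    include_root=$1
    header="$include_root/linux/nv-p2p.h"
    if [ -f "$header" ]; then
        printf '%s\n' "$header"
    else
        printf 'nv-p2p.h not found under %s\n' "$include_root" >&2
        return 1
    fi
}

# Example call, using the include root implied by the path quoted above:
# find_nv_p2p_header /usr/src/nvidia/nvidia-oot/include
```

If the call succeeds, the printed directory root (without the trailing linux/nv-p2p.h) is the one to pass to the compiler with -I, so that `#include <linux/nv-p2p.h>` resolves.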

rubens.lima · October 17, 2024, 5:52pm

Hi, AastaLLL! Thanks a lot!

I added the include path to my Makefile and nv-p2p.h was found, but another error was raised for some functions:

ERROR: modpost: "nvidia_p2p_*" [<path>] undefined!

It looks like the header is found, but the actual implementation is undefined. So I tried to install nv_peer_mem manually (so far, this module isn’t available through lsmod or modprobe). Is that the correct approach? I wasn’t successful.

Is there any guidance or are there any tips for using the out-of-tree nv_peer_mem? Should I install this module manually?
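
For context, modpost reports a symbol as undefined when it finds no entry for it in the kernel's Module.symvers or in any file listed in KBUILD_EXTRA_SYMBOLS. The header only declares the functions. A minimal sketch for checking whether a given Module.symvers lists the nvidia_p2p_* exports follows. The helper name `list_nv_p2p_exports` is hypothetical, and the thread does not say where the relevant Module.symvers lives, so the file path is left as an argument.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list the nvidia_p2p_* symbols
# recorded in a Module.symvers file. Each line of that file is
# whitespace-separated, and the symbol name is the second field.
list_nv_p2p_exports() {
    symvers=$1
    awk '$2 ~ /^nvidia_p2p_/ { print $2 }' "$symvers"
}

# Example call (the path is a placeholder, not a confirmed location):
# list_nv_p2p_exports /path/to/Module.symvers
```

An empty result suggests that file is not the one modpost needs for these symbols.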

rubens.lima · October 21, 2024, 7:19pm

Hi, AastaLLL! Any updates?

AastaLLL · October 23, 2024, 6:25am

Hi,

Sorry for the late update.

We are checking the status with our internal team.
Will share more info with you later.

Thanks.

AastaLLL · October 24, 2024, 2:21am

Hi,

Our internal team has validated the RDMA driver, so it is expected to work.
Please note that you need to check out the rel-36+ branch for the JetPack 6 environment.

Regarding the “linux/nv-p2p.h: No such file or directory” error, please update the include path and symbols as shown below.

KBUILD_EXTRA_SYMBOLS=$(NVIDIA_EXTRA_SYMBOLS)
ccflags-y += -I$(NVIDIA_SRC_DIR)

Also refer to this to get the values for NVIDIA_SRC_DIR and KBUILD_EXTRA_SYMBOLS.

Thanks.
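
As a rough sketch of how the two Makefile lines above might be fed from the command line, the helper below prints a kbuild invocation without running it. Everything here is an assumption for illustration: the function name `print_module_build_cmd` is hypothetical, the include root is presumably the directory containing linux/nv-p2p.h, and the Module.symvers location is not stated in the thread. The sketch also assumes paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print, without running it, a
# kbuild command that passes the two variables used in the Makefile lines above.
print_module_build_cmd() {
    kernel_build_dir=$1   # kernel build tree, e.g. /lib/modules/<release>/build
    module_dir=$2         # directory holding the module's Makefile
    nvidia_src_dir=$3     # include root that contains linux/nv-p2p.h
    nvidia_symvers=$4     # Module.symvers of the NVIDIA out-of-tree build (location is an assumption)
    printf 'make -C %s M=%s NVIDIA_SRC_DIR=%s NVIDIA_EXTRA_SYMBOLS=%s modules\n' \
        "$kernel_build_dir" "$module_dir" "$nvidia_src_dir" "$nvidia_symvers"
}

# Example call with placeholder paths:
# print_module_build_cmd "/lib/modules/$(uname -r)/build" "$PWD" \
#     /usr/src/nvidia/nvidia-oot/include /path/to/Module.symvers
```

Variables given on the make command line are passed down to the kbuild sub-make, so `$(NVIDIA_SRC_DIR)` and `$(NVIDIA_EXTRA_SYMBOLS)` in the module Makefile pick up these values.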

rubens.lima · October 24, 2024, 7:33pm

Thanks a lot, AastaLLL!

