How to run NVMe over RDMA target offload

Hello, I want to run NVMe over RDMA target offload with:
1: an x86 PC;
2: two Mellanox ConnectX-6 cards;
3: an Arm server running Linux 6.1;
4: an Intel D4800X NVMe SSD, which can place its I/O submission and completion queues in its CMB.

I want to do NVMe over RDMA target offload, and I plan to connect the test platform like this:

1: one ConnectX-6 is installed in the PC;
2: the other ConnectX-6 and the D4800X are installed in the Arm server.

I want to place the D4800X I/O queues in its CMB and map those queues to the ConnectX-6's receive and send queues.

I have a few questions about how to enable my case:
1: Are there any limitations on the Arm server and its Linux kernel, such as the server needing to support P2P and to be on the "host bridge whitelist"?
2: Should the Linux kernel of the Arm server support "p2pmem"?
3: Should the "nvme" driver be loaded automatically when Linux boots, in which case the D4800X I/O queues would be set up in the CMB automatically?
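For question 2, one way to check whether the running kernel was built with PCI peer-to-peer DMA (which provides the p2pmem infrastructure) is to inspect the kernel config and the SSD's sysfs node. A sketch; the PCI address `0000:03:00.0` is a placeholder for wherever the D4800X enumerates:

```shell
# Check that the kernel was built with PCI peer-to-peer DMA support
# (CONFIG_PCI_P2PDMA is what provides the p2pmem provider infrastructure)
# and with the NVMe target subsystem.
grep -E 'CONFIG_PCI_P2PDMA|CONFIG_NVME_TARGET' "/boot/config-$(uname -r)"

# If the NVMe driver claimed the SSD and published its CMB as p2pmem,
# a p2pmem attribute group appears under the device's sysfs node,
# reporting total, available, and published sizes.
ls /sys/bus/pci/devices/0000:03:00.0/p2pmem/
cat /sys/bus/pci/devices/0000:03:00.0/p2pmem/size
```

Whether the host bridge allows P2P between the NIC and the SSD is a separate question: on non-whitelisted root complexes the kernel only permits P2P DMA between devices behind the same PCIe switch.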


NVIDIA NVMe-oF offload doesn't support CMB; however, we showed it as a POC for selected customers.
Also, the Arm server is not officially supported as a host for the NIC (we have the BlueField controller, a DPU, which is based on Arm).

Best Regards

Thanks for your reply!
Then I am wondering: if it does not support CMB, where are the NVMe I/O queues located in target offload?
And how does peer-to-peer occur? Is the data moved from the NIC directly to the NVMe device, or
are the RDMA queues directly mapped to NVMe queues inside the NVMe device, i.e. not in the host?

The data moves directly to the target DDR. In the target we handle three structures: the I/O queue, the completion queue, and a staging buffer. The P2P is between the host DDR and the target DDR.

Best Regards

Thanks for your clarification. So this P2P is not PCIe peer-to-peer, which would move data from the NIC directly to the CMB/DDR inside the NVMe device?

I/O moves directly from the initiator DDR to the target DDR. The NIC firmware "offload" then takes control of the PCI bus and moves the data from the target DDR to the SSD, and in the opposite direction. In this way the CPU is "offloaded".

Best Regards


In nvme connect you have the -i parameter. If it is not set (the default), the number of queues is one per core (the minimum between initiator and target). If it is set (e.g. -i 8), you have 8 queues.
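For illustration, a connect invocation might look like this (the address, port, and subsystem NQN are placeholders for your setup):

```shell
# Connect to an NVMe-oF target over RDMA, explicitly requesting 8 I/O queues
# with -i (long form: --nr-io-queues). Without it, the driver defaults to one
# queue per core, capped by what the target can offer.
nvme connect -t rdma \
  -a 192.168.1.10 -s 4420 \
  -n nqn.2014-08.org.example:sub1 \
  -i 8

# Inspect the resulting controller and its transport address.
nvme list-subsys
```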

Best Regards


So, when I use "nvme connect …" to create the NVMe I/O queues:
how does the target NVMe device set up its I/O queues, and where does it create them?
I've read the NVMe over Fabrics spec; there is no answer there.

Without the -i parameter, the default is one queue per core. If the number of cores on the target and the initiator is not equal, it is the minimum of the two.

Best Regards


So, do you mean:
1) the "nvme connect …" command will not actually create I/O queues? It just uses the I/O queues that the NVMe driver created at setup, and "nvme connect …" only does something like "mapping"?

Best Regards

The nvme driver doesn't set up queues in the fabrics (NVMe-oF) case. nvme connect sets up the queues: one per core by default, or per the -i parameter if you set it.
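On the target side, the fabric queues exist only once a host connects; what is set up beforehand is the subsystem and namespace, through the standard nvmet configfs interface. A minimal RDMA target sketch (the NQN, block device path, and address are placeholders; vendor offload builds add extra attributes not shown here):

```shell
# Create a subsystem and expose one NVMe namespace through it.
cd /sys/kernel/config/nvmet
mkdir -p subsystems/nqn.2014-08.org.example:sub1
echo 1 > subsystems/nqn.2014-08.org.example:sub1/attr_allow_any_host
mkdir subsystems/nqn.2014-08.org.example:sub1/namespaces/1
echo -n /dev/nvme0n1 \
  > subsystems/nqn.2014-08.org.example:sub1/namespaces/1/device_path
echo 1 > subsystems/nqn.2014-08.org.example:sub1/namespaces/1/enable

# Create an RDMA port and bind the subsystem to it; hosts can now
# discover and connect, at which point the I/O queues get created.
mkdir ports/1
echo rdma         > ports/1/addr_trtype
echo ipv4         > ports/1/addr_adrfam
echo 192.168.1.10 > ports/1/addr_traddr
echo 4420         > ports/1/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/nqn.2014-08.org.example:sub1 \
      ports/1/subsystems/nqn.2014-08.org.example:sub1
```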

Best Regards

Hi gilh,

I found the discussion on target offload intriguing and have some follow-up questions.

Firstly, could you please elaborate on the statement "NVIDIA NVMe-oF offload doesn't support CMB"? My understanding was that NVMe-oF with CMB was supported, as indicated by Stephen Bates and Mellanox presentations [1,2,3]. They describe an approach combining NVMe-oF on a Mellanox CX5 with an NVMe CMB device, suggesting support for NVMe-oF offload with CMB.

Additionally, I’m curious about why you mentioned the number of queues from the -i parameter. How does the number of queues relate to target offload? Why is it necessary to consider the -i parameter instead of just using the default setting?




We have a POC which shows the capability of offload with CMB; however, it never got to production.
About the -i parameter: I just explained it with regard to the number of queues. However, you can use the default.

Best Regards


Hi Gilh,

Thank you for explaining the queue parameter; it’s now clear to me.

However, I'm a bit confused about the POC versus production-ready status. From the slides and talks I referenced, they did present CMB-target-offloading experiments with performance measurements on Mellanox ConnectX. Could you clarify whether these experiments are not production-ready on a Mellanox ConnectX NIC with a current Linux kernel?

It was planned in the past (to make it production).
NVIDIA NICs don't support CMB with offload.

Best Regards



Thanks for the prompt reply. I look forward to seeing NVIDIA Mellanox NICs being production-ready for CMB target offloading in the future.