I am currently setting up NVMe-oF with target offload using the ConnectX-5 NIC, following the official tutorial (ESPCommunity). The setup was successful, but I am looking for a deeper understanding of the num_p2p_queues parameter.
Specifically, I would like to know how num_p2p_queues impacts the performance of the target offload. Does increasing num_p2p_queues directly improve target offload performance? Is it designed to scale with the number of NVMe SSDs being offloaded by the target, or does it primarily relate to the number of CPU cores on the host side that handle the I/O operations?
Hi,
There is no performance impact on the target offload. The purpose of this parameter is to reserve ("save") dedicated NVMe queues for P2P. Increasing this number may leave the nvme PCI driver with only a single I/O queue for CPU use.
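As a quick sanity check, something like the following works (a rough sketch; it assumes the MLNX_OFED nvme driver from the tutorial, which is what provides num_p2p_queues, and nvme0 is just an example controller):

# reload the driver with one reserved P2P queue per controller
modprobe -r nvme                  # fails if an NVMe device is still in use (e.g. the boot drive)
modprobe nvme num_p2p_queues=1
# read the parameter back, if the driver exports it under sysfs
cat /sys/module/nvme/parameters/num_p2p_queues
# count the queue interrupt vectors the controller registered (q0 is the admin queue)
grep -c nvme0q /proc/interrupts
# query the Number of Queues feature (feature ID 0x7) reported by the controller
nvme get-feature /dev/nvme0 -f 7 -H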
Thanks for your questions.
num_p2p_queues is per SSD. A value of 1 is enough if you don't need high availability in a multipath setup.
P2P queues are used only for traffic over the fabric. The "CPU" queues are the queues that the nvme PCI driver opens for local traffic on the target machine.
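To see what each SSD ends up with on the target, a small loop is enough (again just a sketch; it assumes the controllers show up as /sys/class/nvme/nvme0 and so on, and it only counts what each controller registered in /proc/interrupts; how the reserved P2P queues are reported there depends on the driver version):

# count the queue interrupt vectors each controller has registered (q0 is the admin queue)
for c in /sys/class/nvme/nvme*; do
    n=$(basename "$c")
    echo "$n: $(grep -c "${n}q" /proc/interrupts) queue vectors"
done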
I just want to confirm my understanding: If there are 8 SSDs at the target, setting modprobe nvme num_p2p_queues=1 should be sufficient to offload all 8 SSDs. Is that correct?
Also, does num_p2p_queues have an upper limit?
I ask because I've encountered several kernel driver issues that require a reboot to resolve, especially when I set num_p2p_queues too high (e.g., 16) and the host uses many threads for I/O. In these cases, the target driver (mlx5_core) times out:
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 30324): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 31336): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 32267): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 33208): DESTROY_MKEY(0x202) canceled on out of queue timeout.
Thank you for the clarification.
By “nvmet port,” I think you are referring to /sys/kernel/config/nvmet/ports.
Assuming again there are 8 SSDs at the target, each configured with its own individual nvmet port, does this mean the num_p2p_queues parameter should be set to 8 to offload those 8 nvmet ports and SSDs?
Correct: /sys/kernel/config/nvmet/ports
Target offload of 8 SSDs on one port requires num_p2p_queues=1; if the same 8 SSDs are exposed through two ports, then num_p2p_queues=2 (one P2P queue per SSD, per port).
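For anyone who lands here later, a condensed sketch of the two-port case (the configfs layout is the standard nvmet one; attr_offload stands in for the OFED-specific offload flag, so check the exact attribute name in your driver version; the addresses, subsystem names, and SSD count are placeholders):

# each SSD is exposed on two ports, so reserve two P2P queues per SSD
# (assumes the nvme driver is not yet loaded, or reload it first)
modprobe nvme num_p2p_queues=2
modprobe nvmet_rdma
cd /sys/kernel/config/nvmet

# two RDMA ports on the ConnectX-5 (addresses and service IDs are placeholders)
for p in 1 2; do
    mkdir ports/$p
    echo rdma          > ports/$p/addr_trtype
    echo ipv4          > ports/$p/addr_adrfam
    echo 442$p         > ports/$p/addr_trsvcid
    echo 192.168.10.$p > ports/$p/addr_traddr
done

# one offloaded subsystem per SSD, each linked to both ports
for i in $(seq 0 7); do
    sub=offload_sub$i
    mkdir subsystems/$sub
    echo 1 > subsystems/$sub/attr_allow_any_host
    echo 1 > subsystems/$sub/attr_offload        # OFED-specific offload flag
    mkdir subsystems/$sub/namespaces/1
    echo -n /dev/nvme${i}n1 > subsystems/$sub/namespaces/1/device_path
    echo 1 > subsystems/$sub/namespaces/1/enable
    for p in 1 2; do
        ln -s $PWD/subsystems/$sub ports/$p/subsystems/$sub
    done
done

For the one-port case the only differences are num_p2p_queues=1 and a single port to link each subsystem against.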