Interpretation of `num_p2p_queues` in NVMe-oF Target Offload

Hi NVMe-oF team,

I am currently setting up NVMe-oF with target offload using the ConnectX-5 NIC, following the official tutorial (ESPCommunity). The setup was successful, but I am looking for a deeper understanding of the num_p2p_queues parameter.
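For context, this is roughly how I pass the parameter to the NVMe PCI driver on the target (the value 2 is only an example, and the modprobe.d file name is arbitrary):

modprobe -r nvme                                                          # unload the local NVMe PCI driver
modprobe nvme num_p2p_queues=2                                            # reload it with P2P queues reserved
echo "options nvme num_p2p_queues=2" > /etc/modprobe.d/nvme-offload.conf  # optional: persist the setting across reboots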

Specifically, I would like to know how num_p2p_queues affects the performance of the target offload. Does increasing num_p2p_queues directly improve offload performance? Is it meant to scale with the number of NVMe SSDs being offloaded by the target, or does it primarily relate to the number of CPU cores on the host side that handle the I/O operations?

I came across an answer suggesting that setting this parameter to 2 is recommended for high availability in a multipath setup: “I can only connect 9 nvme devices. When I try to connect 10th device it is failing” - #2 by spruitt
However, I am seeking a more detailed explanation of this parameter’s purpose and impact on performance.

Thank you for your insights!

Hi,
There is no performance impact for the target offload. The purpose of this parameter is to reserve (“save”) dedicated NVMe queues for P2P. Increasing this number may cause the nvme pci driver to create only a single I/O queue for CPU usage.
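If you want to see the effect, a rough check (not from the tutorial; nvme0n1 and nvme0 are just example names) is to count the block-layer hardware queues left for a local drive and to look at the driver's queue-allocation log line:

ls /sys/block/nvme0n1/mq | wc -l     # number of blk-mq hardware I/O queues created for this namespace
dmesg | grep "nvme nvme0:"           # the nvme driver logs how many queues it allocated (message format varies by kernel)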

Best Regards,
Anatoly

Hi Anatoly,

Thanks. I am wondering: is only one queue enough if the target has multiple SSDs to be offloaded?

Also, I am not clear on what “a single IO queue for CPU usage” means; could you explain that? Thanks.

Thanks for your questions.
num_p2p_queues is per SSD. 1 is enough if you don’t use “high availability in a multipath setup”.
P2P queues are used only for traffic over the fabric. The “CPU” queues are the queues that the nvme pci driver opens for local traffic on the target machine.
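To make this concrete, the reserved P2P queues are used once you bind an SSD into an offloaded subsystem via configfs, roughly like this (testsubsystem and /dev/nvme0n1 are example names; the standard nvmet attributes are upstream, but attr_offload is how I recall the offload attribute being named, so please verify it against your MLNX_OFED version):

mkdir /sys/kernel/config/nvmet/subsystems/testsubsystem
echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/attr_allow_any_host
echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/attr_offload      # enable target offload for this subsystem (offload-specific attribute)
mkdir /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1
echo -n /dev/nvme0n1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1/device_path
echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1/enable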

Best Regards,
Anatoly

@abirman Thank you for the reply.

I just want to confirm my understanding: if there are 8 SSDs at the target, loading the driver with modprobe nvme num_p2p_queues=1 should be sufficient to offload all 8 SSDs. Is that correct?
Also, does num_p2p_queues have an upper limit?


I ask because I’ve encountered several kernel driver issues that require a reboot to resolve, especially when I set num_p2p_queues too high (e.g., 16) and the host uses many threads for I/O. In these cases, the target’s mlx5_core driver reports command timeouts:

[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 30324): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 31336): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 32267): DESTROY_MKEY(0x202) canceled on out of queue timeout.
[Di Sep 17 10:50:22 2024] mlx5_core 0000:60:00.0: wait_func:1186:(pid 33208): DESTROY_MKEY(0x202) canceled on out of queue timeout.

Hi,

Yes, this is correct, but if you configure two nvmet ports (all ports exposing the same SSD), then you may need to set num_p2p_queues=2.

Best Regards,
Anatoly

Thank you for the clarification.
By “nvmet port,” I think you are referring to /sys/kernel/config/nvmet/ports.

Assuming again there are 8 SSDs at the target, and each is planned to be configured with its own individual nvmet port, does this mean the num_p2p_queues parameter should be set to 8 to offload those 8 nvmet ports and SSDs?

Correct: /sys/kernel/config/nvmet/ports.
Target offload of 8 SSDs on one port requires num_p2p_queues=1; if the same 8 SSDs are exposed through two ports, then num_p2p_queues=2.
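As a sketch of the two-port case (port numbers, addresses, and the subsystem name are only examples; these are the standard nvmet configfs attributes): create two RDMA ports and link the same offloaded subsystem into both, and with this layout num_p2p_queues=2 is needed, as described above.

mkdir /sys/kernel/config/nvmet/ports/1
echo rdma > /sys/kernel/config/nvmet/ports/1/addr_trtype
echo ipv4 > /sys/kernel/config/nvmet/ports/1/addr_adrfam
echo 192.168.1.1 > /sys/kernel/config/nvmet/ports/1/addr_traddr
echo 4420 > /sys/kernel/config/nvmet/ports/1/addr_trsvcid
mkdir /sys/kernel/config/nvmet/ports/2
echo rdma > /sys/kernel/config/nvmet/ports/2/addr_trtype
echo ipv4 > /sys/kernel/config/nvmet/ports/2/addr_adrfam
echo 192.168.2.1 > /sys/kernel/config/nvmet/ports/2/addr_traddr
echo 4420 > /sys/kernel/config/nvmet/ports/2/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/testsubsystem /sys/kernel/config/nvmet/ports/1/subsystems/testsubsystem
ln -s /sys/kernel/config/nvmet/subsystems/testsubsystem /sys/kernel/config/nvmet/ports/2/subsystems/testsubsystem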

Best Regards,
Anatoly
