Max. 64 event queues on ConnectX-5?

Hello,

after installing an MCX515A-CCAT_C11 in an EPYC server with 64 cores / 128 threads, only 64 event queues and MSI interrupts were activated (one mlx5_async and 63 mlx5_comp*). Is this a HW limit of this card, or is it intentionally limited to 64 in the driver for performance reasons? Other PCI devices in this server create 128 MSI interrupts and distribute them across all CPUs.

Thanks

Hello,

64 event queues is the default value on NVIDIA NICs.
This can be changed to 128 with mlxconfig parameters.
You can read the current value:
mlxconfig -d /dev/mst/ q | grep NUM_PF_MSIX
~ NUM_PF_MSIX 63

(63 is actually 64 interrupts)

And this can be changed to 128 like so:
mlxconfig -d /dev/mst/ set NUM_PF_MSIX=127
(127 is actually 128).
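
To confirm how many vectors the driver actually created, the mlx5 lines in /proc/interrupts can be counted. This is a minimal sketch, assuming the interrupt names contain mlx5_comp and mlx5_async as described in the question. count_mlx5_irqs is a hypothetical helper name, and it counts all mlx5 devices in the system together:

```shell
# Hypothetical helper: count mlx5 completion and async interrupt lines.
# Reads /proc/interrupts by default; a different file can be passed for testing.
count_mlx5_irqs() {
    file="${1:-/proc/interrupts}"
    # grep -c prints the number of matching lines (0 if there are none)
    comp=$(grep -c 'mlx5_comp' "$file")
    async=$(grep -c 'mlx5_async' "$file")
    echo "comp=$comp async=$async"
}

# Usage: count_mlx5_irqs
```

Running it before and after the change shows whether the new value was picked up (mlxconfig settings normally take effect only after a reboot or firmware reset).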

If there are any additional questions or issues regarding this, please open a case with enterprisesupport@nvidia.com and it will be handled based on entitlement.

Thanks,
Jonathan.

Hi Jonathan,

actually, NUM_PF_MSIX=63 means that 1 of the 64 available HW cores does not participate in handling network traffic. Even with NUM_PF_MSIX set to 127, one CPU again won’t be handling traffic.

Why is this the default?

I managed to set NUM_PF_MSIX=64, which created 66 MSI interrupts. Now all 64 HW cores are handling traffic, and there’s still room for mlx5_async to have its own separate MSI interrupt.

Is such a setup OK, or would it have unexpected side effects?

BTW, 127 is the maximum value the firmware accepts, so setting it to 128 is not possible.

Thanks