How exactly does the Adapter or Driver decide on the number of queues?

Hi, I hope you’re doing well,

I got a CX4131A-GCAT Mellanox adapter card and installed it on my PowerEdge R840 server running RHEL8. My goal was to have as many queues as possible, preferably 512 queues.

But when I checked using ethtool -l, I saw the following:

Pre-set maximums:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       63
Current hardware settings:
RX:             n/a
TX:             n/a
Other:          n/a
Combined:       63

And I was wondering:

  1. How does the Adapter or Driver determine the number of queues? I read somewhere that it’s based on the number of CPU cores and other HW specs, but I would like to know where to find more details (I looked through the documentation and couldn’t find anything about the limits or how exactly they’re determined).
  2. Is it possible to increase the number of queues for my card? I ask because I have an XL710 card in the same server, and that one shows 96 queues.
  3. If I wanted 512 queues, what recommendations would you have for me?

Here’s the driver information for the NIC that I got from ethtool -i:

driver: mlx5_core
version: 6.13.9-Patched+
firmware-version: 14.20.1010 (MT_2430110032)

Please let me know if there’s any other information that you’d need me to provide to help you help me with these questions.

Thank you in advance,
Chris.

Hello Chris,

Thank you for your inquiry about the queue configuration on your CX4131A-GCAT Mellanox adapter card. I’ll address each of your questions regarding network queues.

## How the Adapter/Driver Determines the Number of Queues

The number of queues available on a Mellanox adapter is determined by several factors:

1. **Hardware Capabilities**: Each ConnectX adapter has a maximum number of queues it can support based on its hardware architecture. The ConnectX-4 series has specific limitations built into the hardware.

2. **Driver Implementation**: The mlx5_core driver allocates queues based on:

  • Available hardware resources on the adapter

  • System memory constraints

  • Number of CPU cores in the system

  • PCI bandwidth considerations

3. **Firmware Version**: The firmware (14.20.1010 in your case) can also influence the maximum number of queues exposed to the operating system.

The combined value of 63 queues you’re seeing is likely the maximum that your specific card model can support with the current firmware and driver combination.
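To see which of these factors is the binding one on your system, it can help to put the numbers side by side. The sketch below is only a starting point: it assumes the reported maximum is bounded by the online CPU count and by the MSI-X vectors the adapter exposes, which matches my understanding of mlx5_core but is worth verifying on your setup. The helper names `max_combined` and `msix_count` are my own, and `IFACE` is a placeholder for your interface name.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they only parse text piped in on stdin.

# Print the "Combined" value from the "Pre-set maximums" section
# of `ethtool -l` output.
max_combined() {
    awk '
        /^Pre-set maximums:/          { in_max = 1; next }
        /^Current hardware settings:/ { in_max = 0 }
        in_max && $1 == "Combined:"   { print $2; in_max = 0 }
    '
}

# Print the MSI-X table size ("Count=N") from `lspci -vv` output.
msix_count() {
    sed -n 's/.*MSI-X:.*Count=\([0-9][0-9]*\).*/\1/p'
}

# Usage on a live system (run as root so lspci can read the capabilities):
#   IFACE=eth0   # placeholder, replace with your interface name
#   nproc        # online CPUs
#   ethtool -l "$IFACE" | max_combined
#   lspci -vv -s "$(basename "$(readlink -f "/sys/class/net/$IFACE/device")")" | msix_count
```

If the MSI-X count comes out close to the combined maximum while `nproc` is much higher, the vector budget of the adapter is the more likely constraint than the CPU count.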

## Increasing the Number of Queues

For your specific card (CX4131A-GCAT), the maximum number of queues is hardware-limited. While the XL710 card you mentioned supports 96 queues, the ConnectX-4 Lx has a different hardware architecture and different capabilities, so the two numbers are not directly comparable.

You can try the following to ensure you’re getting the maximum possible queues:

1. **Update Firmware**: Check if there’s a newer firmware available for your adapter that might support more queues.

2. **Update MLNX_OFED Driver**: Using the latest NVIDIA Mellanox OFED driver package might provide optimizations that allow for more efficient queue allocation.

3. **Kernel Parameters**: You can try adding kernel boot parameters to allocate more memory for the network subsystem, though this is unlikely to exceed the hardware limits.

4. **Queue Distribution**: While you can’t increase beyond the hardware maximum, you can optimize how queues are distributed using commands like:

```
# Replace <interface> with your device name and <N> with the desired count
ethtool -L <interface> combined <N>
```
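A request above the pre-set maximum is rejected, so a small wrapper can clamp the requested count first. This is a minimal sketch, not an official tool: `clamp_channels` is a name I made up, and `IFACE` is again a placeholder for your interface name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the requested channel count, capped at the
# maximum advertised by `ethtool -l`.
clamp_channels() {
    local requested=$1 max=$2
    if [ "$requested" -gt "$max" ]; then
        echo "$max"
    else
        echo "$requested"
    fi
}

# Usage on a live system (requires root):
#   IFACE=eth0   # placeholder, replace with your interface name
#   max=$(ethtool -l "$IFACE" | awk '/^Combined:/ { print $2; exit }')
#   ethtool -L "$IFACE" combined "$(clamp_channels 512 "$max")"
#   ethtool -l "$IFACE"   # confirm the new "Current hardware settings"
```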

## Recommendations for 512 Queues

To achieve 512 queues, you would need to consider:

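1. **Hardware Upgrade**: The ConnectX-5 or newer adapters support significantly more queues than the ConnectX-4 series. The ConnectX-6 Dx, for example, can support up to 512 queues per port. Please confirm the figure against the product specifications for the exact model, since the number actually exposed also depends on the driver, firmware, and CPU core count described above.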

2. **Multiple NICs**: You could use multiple ConnectX-4 cards and distribute your workload across them.

3. **SR-IOV Configuration**: If your workload is virtualized, using SR-IOV with multiple virtual functions might help distribute the processing more efficiently, even with the queue limitation.
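For the multiple-NIC route, a quick calculation shows how many ports it would take. The sketch below assumes every port exposes the same 63 combined queues you see today; `ports_needed` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ports needed to reach a target queue count,
# using integer ceiling division.
ports_needed() {
    local target=$1 per_port=$2
    echo $(( (target + per_port - 1) / per_port ))
}

ports_needed 512 63   # prints 9 (9 x 63 = 567 queues; 8 ports give only 504)
```

Since the CX4131A-GCAT is a single-port card, that would mean nine adapters, which is worth weighing against the number of free PCIe slots in the server.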

## Additional Information

For your specific adapter (CX4131A-GCAT, ConnectX-4 Lx), the current setting of 63 combined queues already equals the pre-set maximum reported by ethtool -l, so it cannot be raised through ethtool configuration alone. As noted above, this is most likely a limit of the card with its current firmware and driver.