Hello everyone,
I have the following equipment:
- 2x NICs: MCX455A-FCAT, PCIe Gen3 x16 (each one installed in a separate node)
- Switch: MSX610F-BS
Both the (VPI) NICs and the switch are configured in InfiniBand mode.
And I am running the following experiment:
I have two servers, and I am sending small messages (~30 B) over UD QPs from one server to the other (in both directions).
I have heavily optimized the code (e.g. batching to the NIC, inlining, selective signalling, multiple QPs, etc.).
I run this experiment in two different configurations:
- I connect the servers through the switch.
- I connect the NICs directly, back-to-back (via a single cable).
The strange thing is that I get different results. More precisely, reading the Mellanox counters, I see different performance in terms of packets per second:
- With the switch I get around 63 Mpps (in each direction).
- Without the switch I get up to 80 Mpps (per direction), and at that point I am highly confident that I am bottlenecked by PCIe.
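For reference, here is a minimal sketch of how a packets-per-second figure can be derived from a port counter: read it twice and divide the difference by the elapsed time. It assumes the standard Linux sysfs counter `port_xmit_packets`; the device name `mlx5_0` and port `1` are placeholders, and this is not necessarily the exact counter or tool behind the numbers above.

```python
import time
from pathlib import Path

# Assumption: device "mlx5_0" and port 1 are placeholders for the local HCA.
COUNTER = "/sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_packets"


def read_counter(path):
    """Read a single integer counter value from a sysfs-style file."""
    return int(Path(path).read_text().strip())


def packets_per_second(before, after, elapsed_s):
    """Convert two counter samples into a rate (counter wrap is not handled)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (after - before) / elapsed_s


def sample_rate(counter_path, interval_s=1.0):
    """Sample the counter twice, interval_s apart, and return packets/sec."""
    before = read_counter(counter_path)
    start = time.monotonic()
    time.sleep(interval_s)
    after = read_counter(counter_path)
    elapsed = time.monotonic() - start
    return packets_per_second(before, after, elapsed)


# Only touch the hardware counter when run as a script on a host that has it.
if __name__ == "__main__" and Path(COUNTER).exists():
    print(f"{sample_rate(COUNTER) / 1e6:.1f} Mpps transmitted")
```

Sampling over a longer interval (several seconds) reduces the error introduced by the two reads not being perfectly aligned with the timer.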
So my question is the following:
- Do I have a defective/misconfigured switch, or is it common for a switch not to operate at line rate for small packets (i.e. to have a lower forwarding rate)?
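To make "line rate for small packets" concrete, here is a back-of-the-envelope sketch of the link-level upper bound for ~30 B UD messages. It assumes a 4x FDR link (14.0625 Gb/s per lane, 64b/66b encoding), no GRH, and the standard InfiniBand header sizes (LRH 8 B, BTH 12 B, DETH 8 B, ICRC 4 B, VCRC 2 B); it ignores inter-packet framing symbols and flow-control traffic, so the real ceiling is somewhat lower.

```python
# Per-packet overhead of an InfiniBand UD packet without a GRH (bytes).
LRH_BYTES = 8    # Local Route Header
BTH_BYTES = 12   # Base Transport Header
DETH_BYTES = 8   # Datagram Extended Transport Header
ICRC_BYTES = 4   # Invariant CRC
VCRC_BYTES = 2   # Variant CRC

# Assumption: 4x FDR link, 14.0625 Gb/s signalling per lane, 64b/66b encoding.
FDR_4X_DATA_RATE_BPS = 4 * 14.0625e9 * 64 / 66


def ud_packet_bytes(payload_bytes):
    """Size of one UD packet on the wire; the payload is padded to 4 bytes."""
    padded = (payload_bytes + 3) // 4 * 4
    headers = LRH_BYTES + BTH_BYTES + DETH_BYTES
    return headers + padded + ICRC_BYTES + VCRC_BYTES


def max_packets_per_second(payload_bytes, data_rate_bps=FDR_4X_DATA_RATE_BPS):
    """Upper bound on packets/sec in one direction, ignoring framing gaps."""
    return data_rate_bps / (ud_packet_bytes(payload_bytes) * 8)


if __name__ == "__main__":
    bound = max_packets_per_second(30)
    print(f"{ud_packet_bytes(30)} B per packet, {bound / 1e6:.1f} Mpps bound")
```

Under these assumptions, both measured rates (63 Mpps and 80 Mpps) sit below the link-level bound, which is consistent with the limit being somewhere other than the raw link bandwidth.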
P.S. Also, since I have only 2 servers connected through the switch, I don’t think there can be any congestion or anything else that would explain the degradation. Am I missing something?