I have installed Mellanox ConnectX-4 InfiniBand cards in a small cluster. I installed the drivers using the RPM method described in the documentation below, and I have tested both the MLNX and UPSTREAM packages from the ISO.
https://docs.mellanox.com/display/MLNXOFEDv461000/Installing+Mellanox+OFED
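For context, this is how I verified which driver stack is active after each install (just the standard OFED utilities; exact output on my systems will vary):

    # Show the installed OFED stack version
    $ ofed_info -s

    # Confirm the mlx5 kernel modules are loaded
    $ lsmod | grep mlx5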
The physical link information looks good, but every performance test I run (iperf, iperf3, qperf, ib_send_bw) shows lower-than-expected throughput (14-28 Gbps, depending on the test). I realize some performance tuning may be possible, but the baseline numbers I am seeing are well below anything that could be recovered by tweaking kernel or TCP settings.
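For reference, here is how I checked the link and a couple of representative test invocations (hostnames are placeholders; my exact flags may have varied slightly between runs):

    # Verify physical link state and rate
    $ ibstat

    # TCP throughput over IPoIB (server on node1, client on node2)
    node1$ iperf3 -s
    node2$ iperf3 -c node1 -t 30

    # RDMA verbs bandwidth across all message sizes
    node1$ ib_send_bw -d mlx5_0 -a
    node2$ ib_send_bw -d mlx5_0 -a node1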
The cluster is AMD EPYC based, but I re-tested point-to-point between a couple of Intel Xeon based systems and saw similar numbers, so I believe I have a configuration issue rather than a platform issue. In the past I had used CONNECTED_MODE and MTU=65520, but that is no longer an option with this latest driver.
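For what it's worth, this is roughly how I had connected mode configured previously on RHEL/CentOS (ifcfg-ib0 shown as an example; this approach no longer works with the current driver):

    # /etc/sysconfig/network-scripts/ifcfg-ib0 (old configuration)
    CONNECTED_MODE=yes
    MTU=65520

    # Or, equivalently, set at runtime:
    $ echo connected > /sys/class/net/ib0/mode
    $ ip link set ib0 mtu 65520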
I have included all the pertinent information I can think of in the attached files. If there is anything else I can provide, please let me know. Any help would be greatly appreciated.
mellanox-infiniband-details.txt (18.5 KB)
sysinfo-snapshot-v3.5.0-n001.cluster.com-20200417-025257.tgz (953 KB)