The card is an MCX516A-CDAT (100G, dual port) in a Supermicro 4124-GS-TNR. I am using ib_write_bw to run a bidirectional test on a single port.
The first test reached 196Gb/s with no problem. After a restart it dropped straight to 165Gb/s. Another restart may still give 165Gb/s, but after restarting once more it went back to 196Gb/s.
When the measured rate is 165Gb/s, ib_write_bw also often hangs: it gets stuck right after the command is executed. Sometimes both ports hang, and sometimes the second port still runs.
server [d1]:
ib_write_bw -d mlx5_0 -R --run_infinitely --report_gbits
client [d2]:
ib_write_bw -d mlx5_0 -R --run_infinitely -b --report_gbits 192.168.200.25
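To make the two symptoms easier to tell apart across restarts, the client run could be wrapped roughly as below. This is only a sketch under assumptions, not a tested procedure: classify_bw and run_client are hypothetical helper names, the 190 Gb/s threshold is simply a value picked between the two rates I observed (196 and 165), and the 60 second timeout is arbitrary. It also swaps --run_infinitely for a fixed 10 second run (-D 10) so that the command ends on its own; the server side would presumably need to be started with matching options.

```shell
#!/usr/bin/env bash

# Hypothetical helper: label a measured bandwidth (Gb/s) as normal or degraded.
# The default threshold of 190 is an assumption, chosen between 196 and 165.
classify_bw() {
    local gbps="$1" threshold="${2:-190}"
    # awk does the floating-point comparison that plain shell arithmetic cannot
    awk -v v="$gbps" -v t="$threshold" \
        'BEGIN { print ((v + 0 >= t + 0) ? "normal" : "degraded") }'
}

# Hypothetical helper: run the client for a fixed duration under a timeout,
# so a hung ib_write_bw is reported instead of blocking the shell forever.
run_client() {
    local server_ip="$1"
    timeout 60 ib_write_bw -d mlx5_0 -R -D 10 -b --report_gbits "$server_ip"
    local rc=$?
    # coreutils timeout exits with 124 when it had to kill the command
    if [ "$rc" -eq 124 ]; then
        echo "stuck"
    fi
    return "$rc"
}
```

For example, run_client 192.168.200.25 on d2 prints the usual ib_write_bw report, or "stuck" if the test never finishes, and classify_bw 165 labels the reported average as degraded.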
When the test rate is normal, it shows the rate in the screenshot.
When it is not normal, the port on the machine hangs and cannot be tested, and then the following error is reported. All I do to recover is restart the machine and then run mst start.
Of course, the test may also run normally after a restart, but then the rate drops to 165Gb/s and stays there without going back up.
I have attached the output of ibdiagnet --pc --pm_pause_time 600 -P all=1 --get_phy_info --get_cable_info and of sysinfo-snapshot.py from both machines:
d1.zip (2.2 KB)
d2.zip (2.2 KB)
sysinfo-snapshot-v3.7.0-d1-20230410-112904.tgz (4.9 MB)
sysinfo-snapshot-v3.7.0-d2-20230410-112901.tgz (5.4 MB)