I have an OSFP 2x400Gb optical module inserted into one of the ports on a QM9700 InfiniBand switch.
Over a period of 6-8 hours, I'm seeing FEC errors as high as Bin 10 in the histogram, even though the Raw Physical BER is good (7E-12). What could be causing these occasional bursts of FEC errors after the link has been running for some time?
Below is the information the QM9700 reports for this port:
switch-09145c [standalone: master] (config fae) # mlxlink -d lid-1 -c -e --show_histogram --rx_fec_histogram -p 23/2
Operational Info
State : Active
Physical state : LinkUp
Speed : IB-NDR
Width : 4x
FEC : Standard_RS-FEC - (544,514)
Loopback Mode : No Loopback
Auto Negotiation : ON
Supported Info
Enabled Link Speed : 0x00000080 (NDR)
Supported Cable Speed : 0x00000080 (NDR)
Troubleshooting Info
Status Opcode : 0
Group Opcode : N/A
Recommendation : No issue was observed
Tool Information
Firmware Version : 31.2012.3040
amBER Version : 2.17
MFT Version : mft 4.25.0-203
Physical Counters and BER Info
Time Since Last Clear [Min] : 1557.0
Symbol Errors : 0
Symbol BER : 15E-255
Effective Physical Errors : 0
Effective Physical BER : 15E-255
Raw Physical Errors Per Lane : 11701,196498,1779,40683
Raw Physical BER : 7E-12
Link Down Counter : 0
Link Error Recovery Counter : 0
EYE Opening Info
FOM Mode : SLRG_FOM_MODE_EYEO
Lane : 0, 1, 2, 3
Initial FOM : 80, 82, 81, 94
Last FOM : 85, 87, 82, 97
Upper Grades : 80, 87, 84, 101
Mid Grades : 110, 117, 118, 129
Lower Grades : 83, 92, 78, 96
Histogram of FEC Errors
Header : Range Occurrences
Bin 0 : [0] 7298988504424
Bin 1 : [1] 56597
Bin 2 : [2] 93947
Bin 3 : [3] 1685
Bin 4 : [4] 273
Bin 5 : [5] 1
Bin 6 : [6] 0
Bin 7 : [7] 0
Bin 8 : [8] 1
Bin 9 : [9] 0
Bin 10 : [10] 1
Bin 11 : [11] 0
Bin 12 : [12] 0
Bin 13 : [13] 0
Bin 14 : [14] 0
Bin 15 : [15] 0
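
To make the numbers above easier to reason about, here is a small Python sketch that summarizes the histogram and the per-lane raw error counts. It is not part of mlxlink; the helper names `summarize` and `lane_shares` are made up for illustration. It assumes that Bin k counts codewords in which k symbols were corrected, and it uses the fact that RS(544,514) can correct up to (544 - 514) / 2 = 15 symbols per codeword.

```python
# Values copied by hand from the mlxlink output above.
fec_histogram = {
    0: 7298988504424, 1: 56597, 2: 93947, 3: 1685, 4: 273, 5: 1,
    6: 0, 7: 0, 8: 1, 9: 0, 10: 1, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0,
}
raw_errors_per_lane = [11701, 196498, 1779, 40683]

# RS(544,514): n = 544 symbols per codeword, k = 514 data symbols.
N, K = 544, 514
T = (N - K) // 2  # maximum correctable symbols per codeword (15)


def summarize(hist):
    """Summarize a FEC histogram.

    Assumption: bin b holds the number of codewords with b corrected symbols.
    """
    total = sum(hist.values())
    errored = total - hist.get(0, 0)
    corrected_symbols = sum(b * count for b, count in hist.items())
    worst_bin = max((b for b, count in hist.items() if count > 0), default=0)
    return {
        "total_codewords": total,
        "errored_codewords": errored,
        "corrected_symbols": corrected_symbols,
        "worst_bin": worst_bin,
        "margin_symbols": T - worst_bin,  # headroom before an uncorrectable codeword
    }


def lane_shares(errors):
    """Return each lane's fraction of the total raw error count."""
    total = sum(errors)
    return [e / total for e in errors] if total else [0.0] * len(errors)


if __name__ == "__main__":
    for key, value in summarize(fec_histogram).items():
        print(f"{key}: {value}")
    for lane, share in enumerate(lane_shares(raw_errors_per_lane)):
        print(f"lane {lane}: {share:.1%} of raw errors")
```

With the values above, the weighted bin sum comes to 250661, which is the same as the sum of the four per-lane raw error counts, so the bin interpretation looks consistent with the counters. The worst codeword (Bin 10) still leaves 5 symbols of headroom under that assumption, and lane 1 accounts for roughly 78% of the raw errors.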