We replaced an existing CX5 card with a new CX6 card. The CX5 card was connected with dual 100G SFP. After the swap, we could no longer connect at 100G, but swapping in 40G worked. We see that the card is connecting at Gen 3 and not Gen 4 and therefore isn’t providing enough bandwidth.
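To put numbers on the Gen 3 versus Gen 4 difference, here is a rough sketch of the PCIe bandwidth arithmetic. It assumes 128b/130b line encoding for 8 GT/s and 16 GT/s links, and truncates the per-lane rate to whole Mb/s, which matches the 126.016 and 252.048 Gb/s figures in the mlx5_core lines of the dmesg output below. The helper name `pcie_bandwidth_gbps` and the 200 Gb/s dual-port total are illustrative choices of mine, not something taken from the driver.

```python
# Per-lane throughput in Mb/s after line encoding.
# Integer division is an assumption chosen to match the rounding
# seen in the mlx5_core "available PCIe bandwidth" log lines.
ENCODED_MBPS_PER_LANE = {
    "8.0 GT/s": 8000 * 128 // 130,    # PCIe Gen 3, 128b/130b encoding
    "16.0 GT/s": 16000 * 128 // 130,  # PCIe Gen 4, 128b/130b encoding
}


def pcie_bandwidth_gbps(speed: str, width: int) -> float:
    """Return the usable link bandwidth in Gb/s for a speed and lane count."""
    return ENCODED_MBPS_PER_LANE[speed] * width / 1000


gen3_x16 = pcie_bandwidth_gbps("8.0 GT/s", 16)
gen4_x16 = pcie_bandwidth_gbps("16.0 GT/s", 16)

# Two 100GbE ports running at line rate at the same time (illustrative total).
dual_port_gbps = 2 * 100

print(f"Gen 3 x16: {gen3_x16:.3f} Gb/s")
print(f"Gen 4 x16: {gen4_x16:.3f} Gb/s")
print(f"Dual 100G fits in Gen 3 x16: {gen3_x16 >= dual_port_gbps}")
print(f"Dual 100G fits in Gen 4 x16: {gen4_x16 >= dual_port_gbps}")
```

This only covers the PCIe side of the arithmetic. It says nothing about why the Ethernet ports themselves negotiate 40G rather than 100G.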
Some info below:
mlxlink
PCIe Operational (Enabled) Info
-------------------------------
Depth, pcie index, node : 0, 0, 0
Link Speed Active (Enabled) : 8G-Gen 3 (16G-Gen 4)
Link Width Active (Enabled) : 16X (16X)
EYE Opening Info (PCIe)
-----------------------
Physical Grade : 1888, 1426, 1800, 1740, 1833, 1566, 2016, 1767, 1711, 1740, 1800, 2442, 1664, 1458, 1624, 1736
Height Eye Opening [mV] : 151, 114, 144, 139, 146, 125, 161, 141, 136, 139, 144, 195, 133, 116, 129, 138
Phase Eye Opening [psec] : 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16
Management PCIe Performance Counters Info
-----------------------------------------
RX Errors : 0
TX Errors : 19
CRC Error dllp : 0
CRC Error tlp : 0
Effective ber : 15E-255
dmesg/kernel
[ 3.675704] mlx_compat: loading out-of-tree module taints kernel.
[ 3.675815] mlx_compat: module verification failed: signature and/or required key missing - tainting kernel
[ 3.708709] mlx5_core 0000:5e:00.0: firmware version: 22.32.2004
[ 3.708742] mlx5_core 0000:5e:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x16 link at 0000:5d:00.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
[ 4.005266] mlx5_core 0000:5e:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 4.005697] mlx5_core 0000:5e:00.0: E-Switch: Total vports 10, per vport: max uc(128) max mc(2048)
[ 4.012100] mlx5_core 0000:5e:00.0: Port module event: module 0, Cable plugged
[ 4.012465] mlx5_core 0000:5e:00.0: mlx5_pcie_event:299:(pid 1010): Detected insufficient power on the PCIe slot (27W).
[ 4.046062] mlx5_core 0000:5e:00.0: mlx5_fw_tracer_start:830:(pid 932): FWTracer: Ownership granted and active
[ 4.052631] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 4.222914] mlx5_core 0000:5e:00.0: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 4.244545] mlx5_core 0000:5e:00.1: firmware version: 22.32.2004
[ 4.244602] mlx5_core 0000:5e:00.1: 126.016 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x16 link at 0000:5d:00.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
[ 4.559094] mlx5_core 0000:5e:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 4.559565] mlx5_core 0000:5e:00.1: E-Switch: Total vports 10, per vport: max uc(128) max mc(2048)
[ 4.566178] mlx5_core 0000:5e:00.1: Port module event: module 1, Cable plugged
[ 4.566616] mlx5_core 0000:5e:00.1: mlx5_pcie_event:299:(pid 9): Detected insufficient power on the PCIe slot (27W).
[ 4.608098] mlx5_core 0000:5e:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 4.795364] mlx5_core 0000:5e:00.1: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 4.820152] mlx5_core 0000:5e:00.0 enp94s0f0np0: renamed from eth1
[ 4.863814] mlx5_core 0000:5e:00.1 enp94s0f1np1: renamed from eth0
[ 12.508506] mlx5_core 0000:5e:00.0 enp94s0f0np0: Link up
[ 13.061623] mlx5_core 0000:5e:00.1 enp94s0f1np1: Link up
[ 13.480517] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 14.307750] mlx5_core 0000:5e:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
lspci
5e:00.0 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
Subsystem: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:0016]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 35
NUMA node: 0
IOMMU group: 86
Region 0: Memory at c2000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at c5e00000 [disabled] [size=1M]
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (downgraded), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [48] Vital Product Data
Product Name: ConnectX-6 Dx EN adapter card, 100GbE, Dual-port QSFP56, PCIe 4.0 x16, No Crypto
Read-only fields:
[PN] Part number: MCX623106AN-CDAT
[EC] Engineering changes: AH
[V2] Vendor specific: MCX623106AN-CDAT
[SN] Serial number: XXX
[V3] Vendor specific: XXX
[VA] Vendor specific: MLX:MN=MLNX:CSKU=V2:UUID=V3:PCI=V0:MODL=CX623106A
[V0] Vendor specific: PCIeGen4 x16
[VU] Vendor specific: XXX
[RV] Reserved: checksum good, 1 byte(s) reserved
End
Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [c0] Vendor Specific Information: Len=18 <?>
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 04, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 1
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration-, Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
IOVSta: Migration-
Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00
VF offset: 2, stride: 1, Device ID: 101e
Supported Page Size: 000007ff, System Page Size: 00000001
Region 0: Memory at 00000000c4800000 (64-bit, prefetchable)
VF Migration: offset: 00000000, BIR: 0
Capabilities: [1c0 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [230 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Capabilities: [320 v1] Lane Margining at the Receiver <?>
Capabilities: [370 v1] Physical Layer 16.0 GT/s <?>
Capabilities: [420 v1] Data Link Feature <?>
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
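For anyone comparing against their own system, the link fields that matter in the lspci output above can be pulled out with a few regular expressions. This is a minimal sketch that assumes the field layout shown in this post (`LnkCap`, `LnkSta`, `LnkCtl2`). The function name `link_summary` and the embedded sample text are mine. It only reports the values and does not decide what causes the downgrade.

```python
import re

# Three lines copied from the lspci output in this post.
SAMPLE = """\
LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported
LnkSta: Speed 8GT/s (downgraded), Width x16 (ok)
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
"""


def link_summary(lspci_text: str) -> dict:
    """Extract capable, current and target link speeds from lspci -vv text."""
    cap = re.search(r"LnkCap:.*?Speed ([\d.]+)GT/s.*?Width x(\d+)", lspci_text)
    sta = re.search(r"LnkSta:.*?Speed ([\d.]+)GT/s.*?Width x(\d+)", lspci_text)
    tgt = re.search(r"LnkCtl2:.*?Target Link Speed: ([\d.]+)GT/s", lspci_text)
    return {
        "capable_gts": float(cap.group(1)) if cap else None,
        "capable_width": int(cap.group(2)) if cap else None,
        "current_gts": float(sta.group(1)) if sta else None,
        "current_width": int(sta.group(2)) if sta else None,
        "target_gts": float(tgt.group(1)) if tgt else None,
    }


summary = link_summary(SAMPLE)
# The link counts as downgraded when it trains below the advertised speed.
downgraded = (
    summary["capable_gts"] is not None
    and summary["current_gts"] is not None
    and summary["current_gts"] < summary["capable_gts"]
)

print(summary)
print("Link running below capability:", downgraded)
```

Running the same function on the lspci output of the upstream port (0000:5d:00.0 in the dmesg lines) would show whether that side also advertises 16GT/s.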