Reported rxpower in mlxlink and mlxcables

Hello,

Does anyone know how mlxlink and mlxcables differ in the way they report RX power?

  Device Type:      ConnectX5
  Part Number:      MCX516A-CCA_Ax
  Description:      ConnectX-5 EN network interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket; ROHS R6
  PSID:             MT_0000000012

Please see below:

[root@PC-X ~]# mlxlink -d 65:00.1 -m|grep -P 'Firmware Version|MFT Version|Rx Power Current|Temperature'
Firmware Version                : 16.31.1014
MFT Version                     : mft 4.22.1-307
Temperature [C]                 : 32 [-4..74]
Rx Power Current [dBm]          : 4,4,4,2 [-8..8]

vs

[root@PC-X ~]# mlxcables -d 65:00.1_cable_1 -DDM | grep -P 'RX Power\s:|Temperature\s{2,}'
Temperature    : 32C
	RX Power : 3.4368dBm
	RX Power : 3.1739dBm
	RX Power : 3.1269dBm
	RX Power : 2.9190dBm

It seems to me that both commands report the four channels separately, but the 4s in place of 3s look strange to me. At least it does not look like a simple rounding issue.

A similar thing can be seen here:

[root@PC-X ~]# mlxlink -d 65:00.1 --cable --ddm|grep -iP "^rx power"
RX Power                        : 4.000dBm      ,4.000dBm      ,4.000dBm      ,2.000dBm

Also, is anyone aware of how the two tools differ in the methods they use to collect the above information? mlxcables takes far longer to execute than mlxlink, almost as if the information produced by mlxlink were averaged or cached somehow, which could potentially explain the differences in the reported numbers.

Any help is greatly appreciated.


Vesa

Hi Vesa,
mlxlink and mlxcables get the RX power in different ways:
mlxlink shows the result after translation by the FW,
mlxcables shows the raw counters.
Thanks,
Suo

Hi,

Thanks for this detail concerning how the information gets collected.

Now that I know mlxlink output passes through some parsing/translation between the HW-FW and the numbers presented to the user, it would be helpful to understand the logic behind the reported values.

mlxlink appears to present integers that do not match what classical rounding logic would dictate, unless I am misinterpreting the numbers somehow:

mlxlink: 4,4,4,2 compare to mlxcables: 3.4368, 3.1739, 3.1269, 2.9190
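
The mismatch can be checked with a short Python sketch. It applies the usual integer conversions uniformly to the four mlxcables values quoted above and compares each result with the mlxlink integers. It only tests these common conversions and says nothing about what the firmware actually does.

```python
import math

# Per-channel values copied from the outputs quoted in this thread.
mlxcables_dbm = [3.4368, 3.1739, 3.1269, 2.9190]
mlxlink_dbm = [4, 4, 4, 2]

# Candidate integer conversions to test.
conversions = {
    "round": round,
    "floor": math.floor,
    "ceil": math.ceil,
    "trunc": math.trunc,
}

# Apply each conversion to all four channels.
results = {name: [fn(v) for v in mlxcables_dbm] for name, fn in conversions.items()}

# Names of conversions that reproduce the mlxlink integers exactly.
matching = [name for name, values in results.items() if values == mlxlink_dbm]

for name, values in results.items():
    print(f"{name:5s} -> {values}")
print("conversions reproducing mlxlink:", matching)
```

None of the four conversions yields 4,4,4,2 when applied to every channel: ceil gives 4,4,4,3, while floor and trunc give 3,3,3,2 and round gives 3,3,3,3.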

I wonder if there is further documentation available that could explain the difference, and whether I should just stick to the slower “mlxcables -d $device --DDM” query for monitoring purposes. In their current state, the numbers shown by mlxlink are a bit difficult to use for precise monitoring.
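
For the monitoring use case, here is a minimal Python sketch that extracts the per-channel values from text shaped like the mlxcables output quoted earlier in this thread. The function names and the -8..8 dBm bounds (borrowed from the range that mlxlink printed) are assumptions for illustration, and actually invoking mlxcables is left out.

```python
import re

# Matches lines such as "RX Power : 3.4368dBm"
# (format taken from the mlxcables output quoted in this thread).
RX_POWER_RE = re.compile(r"RX Power\s*:\s*(-?\d+(?:\.\d+)?)\s*dBm")


def parse_rx_power(text):
    """Return the per-channel RX power values (dBm) found in the given text."""
    return [float(m) for m in RX_POWER_RE.findall(text)]


def out_of_range(values, low=-8.0, high=8.0):
    """Return (channel_index, value) pairs outside [low, high]; bounds are assumed."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


# Sample text copied from the mlxcables output above.
sample = """Temperature    : 32C
\tRX Power : 3.4368dBm
\tRX Power : 3.1739dBm
\tRX Power : 3.1269dBm
\tRX Power : 2.9190dBm
"""

channels = parse_rx_power(sample)
print(channels)
print(out_of_range(channels))
```

Keeping the parser separate from the command invocation means the same function can be fed from a saved file or from a live query.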

Any help is greatly appreciated,


Vesa

Hi Vesa,
To my knowledge, there is no documentation for this.
I found the difference by checking the code.
Anyhow, for any component, its power consumption should fall within a range rather than being a static value.
So both mlxlink and mlxcables can be used to monitor whether the component works fine.
Thanks,
Suo

Hello,

Thank you for your thoughts on this matter.

I could be wrong, but I would not put it past the realm of possibility that the numbers presented by mlxlink are buggy. Or is there maybe a reason why mlxlink couldn’t produce the same float-style representation of the dBm values that mlxcables does?

At the moment it seems as if calls like int() and ceil() were applied on an arbitrary basis to the same numbers that mlxcables shows - that is, assuming that the dBm values produced by mlxlink represent the same optical power levels as those shown by mlxcables.

At the moment I can accomplish what we need with mlxcables, but its slower execution time compared to mlxlink is unfortunate, especially as the otherwise very useful output of mlxlink is impaired by the integer representation of the dBm values. Surprisingly, mlxlink --cable --ddm shows the same integers with three decimal places, but that is not much more useful.

Thanks,


Vesa