HCA extended port counters

Are 64-bit HCA port counters (i.e. under /sys/class/infiniband/ on Linux systems) a feature specific to Mellanox?

Or are they more generic to OFED?

Do you know when extended port counters were introduced?

thanks,

Chris Hunter

counters_ext is not upstream and is MOFED only.

If a device PMA supports the extended port counters (which is your case), it depends on which kernel is being used. There were recent kernel changes to utilize the optional PortCountersExtended rather than the mandatory PortCounters. So to see the 64-bit values in sysfs you need either a recent kernel with these changes or an older kernel with the relevant changes backported.

At least in the upstream kernel, I only see mandatory PortCounters support in /sys/class/infiniband, but I didn’t chase every HCA driver to see whether any of them augment this.

Which HCA(s) and which OFED or MOFED are you using? What is an example of a specific counter name you see for some HCAs that you don’t see for others?

On my machine, perfquery -x returns 64-bit values for the port counters, but I am unable to determine where these counters are exposed. For example, the counter /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_data is only a 32-bit value and is maxed out at 4294967295. According to the mlx5 docs there should be a counters_ext directory, but it is not present on my system. Is there a way to enable it with mlx4, or how else can I get the correct value?
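To make the symptom concrete, here is a minimal Python sketch that reads every file in a port's counters directory and reports which ones are stuck at the 32-bit maximum. The helper names `read_counters` and `saturated` are made up for illustration, and the path is simply the one quoted in the post; treat it as a rough check, not a supported tool.

```python
from pathlib import Path

U32_MAX = 2**32 - 1  # 4294967295, the value a saturated 32-bit counter sticks at


def read_counters(counters_dir):
    """Return {counter_name: int_value} for every readable file in counters_dir."""
    values = {}
    for entry in sorted(Path(counters_dir).iterdir()):
        try:
            values[entry.name] = int(entry.read_text().strip())
        except (OSError, ValueError):
            # Skip entries that cannot be read or do not hold a plain integer.
            continue
    return values


def saturated(values):
    """Return the names of counters pinned at the 32-bit maximum."""
    return sorted(name for name, value in values.items() if value == U32_MAX)


if __name__ == "__main__":
    # Path taken from the post above; adjust the device and port for your system.
    path = "/sys/class/infiniband/mlx4_0/ports/1/counters"
    if Path(path).is_dir():
        for name in saturated(read_counters(path)):
            print(f"{name} is pinned at {U32_MAX}; try perfquery -x for the 64-bit value")
```

Any counter this flags has stopped counting, so its sysfs value is no longer meaningful.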

These are InfiniBand performance counters; they are not specific to Mellanox drivers.

I believe they have been around for at least 7 or 8 years, if not longer.

Which /sys/class/infiniband counters are you referring to? I suspect that you’re asking about the ones in the PortCountersExtended attribute (which includes 64-bit versions of the data counters in PortCounters plus some unicast/multicast ones). If so, this attribute is optional on a per-device basis and the counters only show up for devices which support them. I don’t have a list of which Mellanox HCAs and switches support them, but it’s easy to tell by running perfquery -x (or --extended) directed at the port in question. If it responds with counters, then they should show up in /sys/class/infiniband on the machine which contains that port.
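If you want the 64-bit values in a script, one option is to parse the perfquery -x output directly. The sketch below assumes each counter is printed as a name, a colon, dot padding, and a decimal value (for example `PortXmitData:....................123`); that layout is an assumption about your perfquery build, and `parse_perfquery` and `query_extended` are hypothetical helper names.

```python
import re
import subprocess

# Assumed line shape: counter name, colon, dot padding, then a decimal value.
LINE = re.compile(r"^\s*(\w+):\.*(\d+)\s*$")


def parse_perfquery(text):
    """Return {counter_name: int_value} for every line matching the assumed shape."""
    counters = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def query_extended():
    """Run `perfquery -x` against the local port and parse what it prints."""
    result = subprocess.run(
        ["perfquery", "-x"], capture_output=True, text=True, check=True
    )
    return parse_perfquery(result.stdout)
```

Lines that do not fit the assumed shape (the comment header, hex fields) are skipped, not guessed at. If perfquery is missing or the port does not answer the extended query, `query_extended` raises instead of returning partial data.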

Yes, exactly. Whether the PortCountersExtended attribute is exposed seems to depend on the hardware and the OFED version. I haven’t discovered the pattern yet. Sometimes “perfquery --extended” will show these counters even if they are not exposed under sysfs (i.e. /sys/class/infiniband).
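To help find the pattern across machines, a small Python sketch like the following could record which counter directories each port exposes. It only looks for the two directory names mentioned in this thread (counters and counters_ext); `counter_dirs` is a hypothetical helper, and the default root is the sysfs path discussed above.

```python
from pathlib import Path


def counter_dirs(root="/sys/class/infiniband"):
    """Map 'device/port' to the counter directories that exist for that port."""
    found = {}
    for port in sorted(Path(root).glob("*/ports/*")):
        if not port.is_dir():
            continue
        # port is <root>/<device>/ports/<n>, so the device is two levels up.
        key = f"{port.parent.parent.name}/{port.name}"
        found[key] = [
            name for name in ("counters", "counters_ext") if (port / name).is_dir()
        ]
    return found


if __name__ == "__main__":
    for key, dirs in counter_dirs().items():
        print(key, ", ".join(dirs) if dirs else "(no counter directories)")
```

Running this on each host, alongside the driver and kernel version, gives a quick table of where the extended directory appears and where it does not.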

We are using CentOS 6.8 with kernel 2.6.32-642.11.1.el6.x86_64 (the latest available) and the CentOS mlx4 kernel modules. (I tried using OFED, but it wouldn’t support NFSoRDMA.)

The changes for this are relatively recent and went into some 4.x kernel.

See some examples here for mlx5; I’m not sure whether it is the same for mlx4.

Understanding mlx5 Linux Counters and Status Parameters: https://community.mellanox.com/s/article/understanding-mlx5-linux-counters-and-status-parameters

I tried using MOFED drivers but the NFS/RDMA module was not working properly so I reverted back to CentOS drivers. Is it possible to use parts of MOFED to enable these counters but still use CentOS drivers for NFS/RDMA?

I don’t know how to mix and match CentOS and MOFED bits, nor whether that’s even possible. It should be possible to update your CentOS kernel for 64-bit counter support. It’s a few patches to drivers/infiniband/core and a rebuild…