Inconsistent hardware timestamping? ConnectX-5 EN & tcpdump

Hi all,

We recently purchased an MCX516A-CCAT from the Mellanox webstore, but ran into the following issue when trying to do a simple latency measurement using hardware timestamping.

Using the following command to retrieve system timestamps:

ip netns exec ns_m0 tcpdump --time-stamp-type=host --time-stamp-precision=nano

Which gives the following results (for example):

master/server            slave/client
19:36:03.883442258 IP15  19:36:03.883379688 IP15
19:36:03.883524725 IP15  19:36:03.883590507 IP15
19:36:03.883678497 IP15  19:36:03.883659403 IP15
19:36:03.883703809 IP15  19:36:03.883716669 IP15
19:36:03.883924377 IP15  19:36:03.883914510 IP15
19:36:03.883939231 IP15  19:36:03.883947770 IP15
19:36:03.883971437 IP15  19:36:03.883961851 IP15
19:36:03.883985143 IP15  19:36:03.883994953 IP15
19:36:03.884010765 IP15  19:36:03.884005137 IP15
19:36:03.884021139 IP15  19:36:03.884030823 IP15
19:36:03.884051422 IP15  19:36:03.884046094 IP15
19:36:03.884062029 IP15  19:36:03.884068390 IP15
19:36:03.884083780 IP15  19:36:03.884078674 IP15
19:36:03.884091661 IP15  19:36:03.884100314 IP15
19:36:03.884127283 IP15  19:36:03.884119333 IP15
19:36:03.884135654 IP15  19:36:03.884141135 IP15
19:36:03.884159177 IP15  19:36:03.884152060 IP15
19:36:03.884167900 IP15  19:36:03.884173955 IP15
19:36:03.884187810 IP15  19:36:03.884182438 IP15
19:36:03.884197308 IP15  19:36:03.884203057 IP15

This is expected: the timestamps are in chronological order. About the traffic: small, equally sized packets are bounced back and forth, and the client initiates the exchange. So for the client the odd-numbered timestamps are outgoing and the even-numbered ones are incoming, and vice versa for the server.
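To illustrate how these host timestamps translate into a latency figure, here is a minimal Python sketch using the first four rows of the capture above. It assumes both namespaces are stamped from the same system clock (they live on the same host), and the helper names `to_ns` and `one_way_latencies` are made up for this example. The resulting numbers include host stack delay, so they are only a rough software-level estimate.

```python
def to_ns(stamp: str) -> int:
    """Convert 'HH:MM:SS.nnnnnnnnn' into nanoseconds since midnight."""
    hms, frac = stamp.split(".")
    h, m, s = (int(part) for part in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1_000_000_000 + int(frac.ljust(9, "0"))


# First four rows of the host-timestamp capture (server column, client column).
server_host = ["19:36:03.883442258", "19:36:03.883524725",
               "19:36:03.883678497", "19:36:03.883703809"]
client_host = ["19:36:03.883379688", "19:36:03.883590507",
               "19:36:03.883659403", "19:36:03.883716669"]


def one_way_latencies(client, server):
    """Pair each TX stamp with the RX stamp of the same packet.

    Rows 1, 3, ... travel client -> server; rows 2, 4, ... travel back.
    """
    result = []
    for i, (c, s) in enumerate(zip(client, server)):
        if i % 2 == 0:                       # client sent, server received
            result.append(to_ns(s) - to_ns(c))
        else:                                # server sent, client received
            result.append(to_ns(c) - to_ns(s))
    return result


latencies = one_way_latencies(client_host, server_host)
print(latencies)
```

Every difference comes out positive for these rows, which is what "chronological" means here: each packet is received after it was sent.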

But now, when using hardware timestamping, we get the following (for example):

ip netns exec ns_m0 tcpdump --time-stamp-type=adapter_unsynced --time-stamp-precision=nano

master/server            slave/client
14:44:04.710315788 IP15  14:44:04.758411898 IP15
14:44:04.758545873 IP15  14:44:04.710425435 IP15
14:44:04.710567282 IP15  14:44:04.758680982 IP15
14:44:04.758799830 IP15  14:44:04.710662581 IP15
14:44:04.710849394 IP15  14:44:04.758963612 IP15
14:44:04.759069396 IP15  14:44:04.710928565 IP15
14:44:04.711042879 IP15  14:44:04.759157087 IP15
14:44:04.759236686 IP15  14:44:04.711098779 IP15
14:44:04.711141554 IP15  14:44:04.759261251 IP15
14:44:04.759281897 IP15  14:44:04.711140994 IP15
14:44:04.711184281 IP15  14:44:04.759302503 IP15
14:44:04.759324535 IP15  14:44:04.711182978 IP15
14:44:04.711224345 IP15  14:44:04.759344893 IP15
14:44:04.759364437 IP15  14:44:04.711223669 IP15
14:44:04.711266610 IP15  14:44:04.759384802 IP15
14:44:04.759406555 IP15  14:44:04.711267547 IP15
14:44:04.711310310 IP15  14:44:04.759428520 IP15
14:44:04.759449711 IP15  14:44:04.711308661 IP15
14:44:04.711349465 IP15  14:44:04.759469128 IP15
14:44:04.759488431 IP15  14:44:04.711351810 IP15

Now the timestamps are no longer chronological (see the sub-second portions). This is unexpected and makes the latency measurement impossible, as far as I can see. I expected each port to run on its own clock, but that does not appear to be the case: it looks like there is one clock for RX and one clock for TX, instead of one clock per port.

Is there a solution to this? Must I use socket ancillary data in a custom C application to receive the correct timestamps?

I’ll put information on the setup below. Please let me know if more information is needed.

Note: applications like linuxptp do seem to work fine with hardware timestamping and give a path delay in the sub-microsecond range.
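To make the RX/TX observation concrete, here is a small Python sketch over the first six rows of the hardware-timestamp capture. It regroups the stamps by direction instead of by port, relying on the same odd/even convention as before (the client sends rows 1, 3, ...). The helper names are hypothetical, and the check only shows what this sample looks like; it says nothing authoritative about how the adapter assigns clocks.

```python
def to_ns(stamp: str) -> int:
    """Convert 'HH:MM:SS.nnnnnnnnn' into nanoseconds since midnight."""
    hms, frac = stamp.split(".")
    h, m, s = (int(part) for part in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1_000_000_000 + int(frac.ljust(9, "0"))


# First six rows of the adapter_unsynced capture (server column, client column).
server_hw = ["14:44:04.710315788", "14:44:04.758545873", "14:44:04.710567282",
             "14:44:04.758799830", "14:44:04.710849394", "14:44:04.759069396"]
client_hw = ["14:44:04.758411898", "14:44:04.710425435", "14:44:04.758680982",
             "14:44:04.710662581", "14:44:04.758963612", "14:44:04.710928565"]


def split_by_direction(client, server):
    """Return (rx, tx) stamp lists in packet order, merged across both ports."""
    rx, tx = [], []
    for i, (c, s) in enumerate(zip(client, server)):
        if i % 2 == 0:            # client transmits, server receives
            tx.append(to_ns(c))
            rx.append(to_ns(s))
        else:                     # server transmits, client receives
            tx.append(to_ns(s))
            rx.append(to_ns(c))
    return rx, tx


def is_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


rx_hw, tx_hw = split_by_direction(client_hw, server_hw)
print("per-port (server) chronological:", is_increasing([to_ns(s) for s in server_hw]))
print("all RX stamps chronological:   ", is_increasing(rx_hw))
print("all TX stamps chronological:   ", is_increasing(tx_hw))
print("TX minus RX for packet 1 (ns): ", tx_hw[0] - rx_hw[0])
```

For this sample, the per-port series jumps back and forth, while the RX-only and TX-only series each increase steadily across both ports, with the two groups roughly 48 ms apart. That is consistent with the RX/TX split described above.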

The setup:

CentOS Linux 7.

Kernel 3.10.0-862.14.4.el7.x86_64 (default kernel for CentOS 7.5 installation).

Mellanox OFED, latest firmware & drivers.

Using network namespaces ns_m0 with ens6f0 and ns_m1 with ens6f1 to prevent kernel loopback.

Hi Jillis,

To better assist you, I suggest opening a support ticket at support@mellanox.com to discuss this matter further.

Thanks,

Samer