Jetson AGX Xavier both NICs use Same PTP Hardware Clock

I’m using a Jetson AGX Xavier from Connect Tech (Rogue carrier board). I’m using JetPack 4.5, L4T 32.5.

I’m trying to use PTP on both ports, one as a master and one as a slave. For the two NICs, one appears to be an Intel I210 and shows up with lspci -v and appears to use the standard igb driver. The other does not seem to be listed but Connect Tech has told me it is a Microchip PHY (ksz9031rnx) - the driver shows up in ethtool as “eqos”.

Both NICs are listed as having hardware time stamping capabilities with ethtool:

eth0 (Microchip)

steve@JetsonAGX:~$ ethtool -i eth0
driver: eqos
version: 
firmware-version: 
expansion-rom-version: 
bus-info: 2490000.ether_qos
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

steve@JetsonAGX:~$ ethtool -T eth0
Time stamping parameters for eth0:
Capabilities:
	hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
	software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
	hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
	software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
	software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
	hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
	off                   (HWTSTAMP_TX_OFF)
	on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
	none                  (HWTSTAMP_FILTER_NONE)
	ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
	ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
	ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
	ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
	ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
	ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
	ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)

eth1 (Intel I210)

steve@JetsonAGX:~$ ethtool -i eth1
driver: igb
version: 5.4.0-k
firmware-version: 3.25, 0x800005d0
expansion-rom-version: 
bus-info: 0001:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

steve@JetsonAGX:~$ ethtool -T eth1
Time stamping parameters for eth1:
Capabilities:
	hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
	software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
	hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
	software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
	software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
	hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
	off                   (HWTSTAMP_TX_OFF)
	on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
	none                  (HWTSTAMP_FILTER_NONE)
	all                   (HWTSTAMP_FILTER_ALL)

My problem is that both NICs list PTP Hardware Clock: 0 as their PHC. Whenever I run ptp4l, I can see that both clocks use /dev/ptp0 in the output.
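A quick way to check for this condition across several interfaces is to pull the "PTP Hardware Clock" line out of each `ethtool -T` report and flag any index claimed more than once. The following is only a minimal sketch: the helper names `phc_index`, `shared_phc`, and `read_ethtool` are made up for illustration, and it assumes the output format shown above.

```python
import re
import subprocess

def phc_index(ethtool_t_output):
    """Return the PHC index from `ethtool -T` text, or None if there is none."""
    match = re.search(r"^PTP Hardware Clock:\s*(\S+)", ethtool_t_output, re.MULTILINE)
    if match is None:
        return None
    value = match.group(1)
    # ethtool prints "none" instead of a number when no PHC is attached.
    return int(value) if value.isdigit() else None

def shared_phc(outputs):
    """Given {interface: ethtool -T text}, return {index: [interfaces]} for
    every PHC index reported by more than one interface."""
    by_index = {}
    for iface, text in outputs.items():
        idx = phc_index(text)
        if idx is not None:
            by_index.setdefault(idx, []).append(iface)
    return {idx: sorted(ifs) for idx, ifs in by_index.items() if len(ifs) > 1}

def read_ethtool(iface):
    """Run `ethtool -T <iface>` and return its text output."""
    return subprocess.check_output(["ethtool", "-T", iface], universal_newlines=True)

# Example use on the target machine:
#   print(shared_phc({i: read_ethtool(i) for i in ("eth0", "eth1")}))
```

A shared index is not always a bug (some multi-port devices really do share one clock), but for two NICs with different drivers, as here, it is suspicious.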

When I run ptp4l on eth1 with:

sudo ptp4l -m -s -i eth1 -l7

It runs fine and in the output I can see:

ptp4l[80568.812]: selected /dev/ptp0 as PTP clock

However, when I run on eth0:

sudo ptp4l -m -s -i eth0 -l7

It also selects /dev/ptp0 and seems to run, and PTP messages appear to be sent and received correctly, but the PHC does not update and the offset just keeps growing:

ptp4l[81052.328]: selected /dev/ptp0 as PTP clock
ptp4l[81052.331]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[81052.331]: port 1: received link status notification
ptp4l[81052.331]: interface index 3 is up
ptp4l[81052.605]: port 1: setting asCapable
ptp4l[81053.440]: port 1: new foreign master 000c8b.fffe.c07405-4
ptp4l[81057.456]: selected best master clock e415f6.fffe.f71493
ptp4l[81057.458]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[81057.630]: port 1: delay timeout
ptp4l[81058.408]: port 1: delay timeout
ptp4l[81058.410]: delay   filtered      17714   raw      17714
ptp4l[81058.657]: master offset -36353171747 s0 freq  +19748 path delay     17714
ptp4l[81059.665]: master offset -36353216999 s1 freq  -25146 path delay     17714
ptp4l[81059.901]: port 1: delay timeout
ptp4l[81060.181]: port 1: delay timeout
ptp4l[81060.673]: master offset -36353262323 s2 freq -62499999 path delay     17714
ptp4l[81060.673]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[81060.713]: port 1: delay timeout
ptp4l[81060.715]: delay   filtered       9490   raw       1266
ptp4l[81061.268]: port 1: delay timeout
ptp4l[81061.269]: delay   filtered      13506   raw      13506
ptp4l[81061.681]: master offset -36353302491 s2 freq -62499999 path delay     13506
ptp4l[81062.459]: port 1: delay timeout
ptp4l[81062.461]: delay   filtered      15610   raw      18142
ptp4l[81062.690]: master offset -36353350695 s2 freq -62499999 path delay     15610
ptp4l[81063.006]: port 1: delay timeout
ptp4l[81063.008]: delay   filtered      13506   raw       7602
ptp4l[81063.697]: master offset -36353394759 s2 freq -62499999 path delay     13506
ptp4l[81063.847]: port 1: delay timeout
ptp4l[81063.849]: delay   filtered      10554   raw       3744
ptp4l[81064.129]: port 1: delay timeout
ptp4l[81064.132]: delay   filtered      10154   raw      10154
ptp4l[81064.705]: master offset -36353437279 s2 freq -62499999 path delay     10154

So it seems to me that /dev/ptp0 is the PHC for eth1 and /dev/ptp1 is the PHC for eth0. Unfortunately, when I run ptp4l on eth0, it tries to use /dev/ptp0 and it seems as though it cannot access or cannot update that PHC.
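To put a number on "the offset just grows", the `master offset` lines of a ptp4l log can be parsed and the average drift computed. This is a rough sketch, assuming the log format shown above; `parse_offsets` and `drift_ns_per_s` are illustrative names, not part of linuxptp.

```python
import re

# Matches lines such as:
# ptp4l[81058.657]: master offset -36353171747 s0 freq  +19748 path delay     17714
OFFSET_RE = re.compile(
    r"ptp4l\[(\d+\.\d+)\]: master offset\s+(-?\d+)\s+(s\d)\s+freq\s+([+-]?\d+)"
)

def parse_offsets(log_text):
    """Return a list of (time_s, offset_ns, servo_state, freq_ppb) tuples."""
    return [
        (float(t), int(offset), state, int(freq))
        for t, offset, state, freq in OFFSET_RE.findall(log_text)
    ]

def drift_ns_per_s(samples):
    """Average rate of change of the offset between first and last sample."""
    t_first, off_first = samples[0][0], samples[0][1]
    t_last, off_last = samples[-1][0], samples[-1][1]
    return (off_last - off_first) / (t_last - t_first)
```

Applied to the excerpt above, the drift works out to roughly -44,000 ns per second, while the freq column stays pinned at -62499999. That pattern is at least consistent with the servo saturating its adjustment on a clock that is not the one producing the timestamps.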

So one workaround that I found was to upgrade to linuxptp v3.0 (built from source). With this, I can specify the PHC when running ptp4l with the -p flag as such:

sudo ptp4l -m -s -i eth0 -p /dev/ptp1 -l7

Now this works, but the offset is not very stable - or at least not nearly as stable as running ptp4l with the Intel I210 NIC:

ptp4l[81744.924]: port 1: taking /dev/ptp1 from the command line, not the attached ptp0

The offset bounces around within +/-1500 ns, whereas on the Intel I210 the offset is usually stable within +/-15 ns. So when forcing eth0 to use /dev/ptp1, the synchronization is about two orders of magnitude worse than using eth1 with /dev/ptp0.
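For a like-for-like comparison of the two NICs, it helps to summarize the offsets from each run once the servo has locked (state s2) rather than eyeballing the log. The sketch below assumes the usual `master offset ... s2 ...` line format; the function names are hypothetical.

```python
import math
import re

def locked_offsets(log_text):
    """Offsets in ns from 'master offset' lines whose servo state is s2."""
    return [int(v) for v in re.findall(r"master offset\s+(-?\d+)\s+s2\s", log_text)]

def jitter_summary(offsets):
    """Min, max, mean and RMS of a non-empty list of offsets (ns)."""
    n = len(offsets)
    return {
        "min": min(offsets),
        "max": max(offsets),
        "mean": sum(offsets) / n,
        "rms": math.sqrt(sum(o * o for o in offsets) / n),
    }
```

Running both logs (eth0 on /dev/ptp1 and eth1 on /dev/ptp0) through the same summary over the same duration gives a fairer basis than peak values alone.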

So here are my questions:

How do I know which PHC (/dev/ptp*) is on which NIC (eth*)?

Why are both NICs (eth0 and eth1) registering the same PHC when I run ethtool -T?

Is this an issue in the device tree? Is there a way to change which PHC is assigned to which NIC?

And even when I force eth0 to use /dev/ptp1 with a newer version of ptp4l, why is the performance so bad compared to using eth1 with /dev/ptp0?

I’m familiar with the igb driver, but not with the eqos driver - could this be a driver issue? Is there any way to update the eqos driver or NIC firmware?

Any help would be greatly appreciated! Thanks!

Sorry for the late response. We will investigate and post an update soon.

Looks like I can confirm that /dev/ptp1 is the PTP clock for eth0 (Microchip NIC). I checked /sys/class/ptp/ptp1/clock_name and saw:

steve@JetsonAGX:/$ cat /sys/class/ptp/ptp1/clock_name 
eqos_clk
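The same sysfs check can be done for every PHC at once. The sketch below reads each `clock_name` and, where possible, also links a PHC to a network interface through `/sys/class/net/<iface>/device/ptp/`. That second lookup is an assumption: it only works when the driver registered the clock with the NIC's device as parent (true for typical PCI NICs, not verified for eqos). `phc_map` is an illustrative name.

```python
import os

def phc_map(sys_root="/sys"):
    """Return {ptpN: {"clock_name": str, "interfaces": [str, ...]}}."""
    ptp_dir = os.path.join(sys_root, "class", "ptp")
    net_dir = os.path.join(sys_root, "class", "net")
    result = {}
    for ptp in sorted(os.listdir(ptp_dir)):
        with open(os.path.join(ptp_dir, ptp, "clock_name")) as fh:
            result[ptp] = {"clock_name": fh.read().strip(), "interfaces": []}
    if os.path.isdir(net_dir):
        for iface in sorted(os.listdir(net_dir)):
            # Assumption: the PHC is a child of the NIC's device in sysfs.
            linked = os.path.join(net_dir, iface, "device", "ptp")
            if os.path.isdir(linked):
                for ptp in sorted(os.listdir(linked)):
                    if ptp in result:
                        result[ptp]["interfaces"].append(iface)
    return result

# Example use on the target machine:
#   for name, info in phc_map().items():
#       print(name, info["clock_name"], info["interfaces"])
```

If the interface list comes back empty for a clock, the `clock_name` (for example `eqos_clk` above) is still usually enough to identify the driver behind it.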

Also, I heard back from ConnectTech and they seem to have found a bug in the eqos driver that always reports the ptp index as “0”. They offered a fix and kernel patch and said they would reach out to you guys to have it fixed in a future release.

Unfortunately that still doesn’t answer the question of why the offset on /dev/ptp1 is two orders of magnitude worse than using the Intel NIC (eth1) and /dev/ptp0. Connect Tech seems to think that there may be additional latency incurred by having the RGMII link between the MAC and PHY on the Microchip NIC. Unfortunately this is pushing far past my knowledge of the subject, so I can’t really comment.

If there is latency added by the RGMII link, is that latency deterministic? Would it be possible to account for this latency in ptp4l with the "ingressLatency" and "egressLatency" config values?
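For reference, ptp4l documents these two options as constant corrections: ingressLatency is subtracted from receive timestamps and egressLatency is added to transmit timestamps. The sketch below works through what that does to the standard offset and delay arithmetic, assuming the end-to-end delay mechanism. All numeric values are hypothetical, and nothing in this thread establishes whether the RGMII latency on this board is actually constant.

```python
def offset_and_delay(t1, t2, t3, t4, ingress_latency=0, egress_latency=0):
    """Slave-side offset and path delay (ns) for the end-to-end mechanism.

    t1: master transmit time of Sync
    t2: slave receive timestamp of Sync (as reported by hardware)
    t3: slave transmit timestamp of Delay_Req (as reported by hardware)
    t4: master receive time of Delay_Req
    """
    t2 = t2 - ingress_latency  # rx timestamps are taken late, so move them back
    t3 = t3 + egress_latency   # tx timestamps are taken early, so move them forward
    delay = ((t2 - t1) + (t4 - t3)) / 2
    offset = ((t2 - t1) - (t4 - t3)) / 2
    return offset, delay

# Hypothetical case: clocks perfectly aligned, 500 ns wire delay,
# timestamps taken 300 ns late on receive and 100 ns early on transmit.
uncorrected = offset_and_delay(0, 800, 9900, 10500)
corrected = offset_and_delay(0, 800, 9900, 10500,
                             ingress_latency=300, egress_latency=100)
```

In this example the uncorrected result carries a fixed offset bias of (300 - 100) / 2 = 100 ns and an inflated path delay, and the corrected result removes both. The point of the arithmetic is that a constant latency shifts the mean offset; it would not by itself account for an offset that bounces around from sample to sample.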