MGBE PTP Timestamp Timeout

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
1.9.3.10904
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Hello,

We are running a PTP instance on an mgbe interface using linuxptp-1.3.1 with the automotive-master and automotive-slave configurations, and quite frequently see ptp4l time out while polling for a tx timestamp, which results in the ptp4l instance faulting and being restarted. We have increased tx_timestamp_timeout to 1000 ms (which seems excessive), and the failures have become far less frequent, but they still crop up occasionally. Is this a known issue, and if so, is there a known fix or a driver update planned to address it?

Thanks!
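
For readers who want to reproduce the workaround described above, here is a minimal sketch of how the override could be applied to a copy of one of the stock linuxptp config files. The helper name set_tx_timeout and the file paths are hypothetical. The sketch assumes the config has a [global] section, as the linuxptp sample configs do.

```shell
# Hypothetical helper: copy a ptp4l config, forcing tx_timestamp_timeout (ms).
# Usage: set_tx_timeout <source.cfg> <dest.cfg> <milliseconds>
set_tx_timeout() {
    src="$1"; dst="$2"; ms="$3"
    awk -v ms="$ms" '
        # Drop any existing tx_timestamp_timeout line so the option appears once.
        /^[ \t]*tx_timestamp_timeout[ \t]/ { next }
        { print }
        # Re-add the option directly under the [global] section header.
        /^\[global\]/ { print "tx_timestamp_timeout " ms }
    ' "$src" > "$dst"
}
```

Something like set_tx_timeout automotive-slave.cfg automotive-slave-orin.cfg 1000, followed by passing the new file to ptp4l with -f, would match the 1000 ms setting mentioned in the post. This only lengthens the time ptp4l waits for the timestamp and does not address whatever delays it.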

Could you please provide some background information about your setup? Specifically:

Why have you chosen to use linuxptp-1.3.1?
Could you explain the purpose and configuration of ‘automotive-master’ and ‘automotive-slave’ setups in your context? Are they from our documentation or a specific source?

Linuxptp-3.1.1 was chosen simply because at the time it was the most up-to-date version of linuxptp. I believe there has been an update to the version since it was selected, but the changelog does not mention any changes that would affect this behavior. Does NVIDIA have a specific version of Linuxptp they recommend? The specific version was not mentioned in the documentation I had read.

The configuration files are as provided by the linuxptp project. Again, the documentation NVIDIA provides does not appear to address any specific configurations other than using automotive-master and automotive-slave, and does not mention this tx_timestamp_timeout modification at all. If there is a specific configuration that should be used, please point it out.

Thanks!

Please see if the following documentation helps:

In addition to that, it would be immensely helpful if you could provide more details about your setup and the steps you’ve taken so far in your testing process.

Hello!

Yes, those are the pages I have referenced. Under Orin Time Sync, under the heading AVNU PTP Support, it lists the commands necessary for PTP synchronization and mentions the config files used to achieve this, but it makes no mention of which version of linuxptp is used, and the base image does not appear to have linuxptp installed. Further, it does not mention the location of the automotive-master and automotive-slave config files used in the demo, nor does it mention any modifications to those config files that are required for use in this system. So, I ask again: what version of linuxptp does NVIDIA recommend, and is there documentation on any modifications the config files need for use in an Orin system?

As for the specific setup, I am running precisely the commands listed in the documentation, save using mgbe1_0 instead of 2_0.

Thanks

Isn’t MGBE-1 marked as a PTP-disabled port in the first document shared?

It is, in the sense that the default configuration of the Orin does not use the port for PTP. However, nothing in the documentation suggests that it is incapable of being a PTP master or slave, and in fact multiple notes inside the device tree configuration guides explicitly state that mgbe1_0 is supported as a PTP-compliant port.

Could you please specify where I can find the device tree configuration guides that mention mgbe1_0 as a supported PTP-compliant port? This will help me verify this information with our team.

I encourage you to read the notes in Documentation/devicetree/bindings/platform/tegra/tegra-nvethernet.txt as referenced by the PTP Bridging in Orin section in Orin Time Sync.

More to the point, however, I feel I must ask - is it NVIDIA’s official stance that mgbe1_0 is not supported as a PTP interface?

Thanks

Could you share the exact ptp4l command run on mgbe1_0 and its complete output regarding the ‘tx_timestamp_timeout’ issues?

Command:

ptp4l -i mgbe1_0 -m -f /opt/linuxptp-3.1.1/configs/automotive-slave.cfg

Output (after a random amount of time):

...
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: ptp4l[3987.290]: timed out while polling for tx timestamp
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: ptp4l[3987.290]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by >
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: ptp4l[3987.290]: port 1: send sync failed
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: ptp4l[3987.290]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: [3987.290] timed out while polling for tx timestamp
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: [3987.290] increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driv>
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: [3987.290] port 1: send sync failed
Sep 07 18:26:47 tegra-ubuntu ptp4l[4668]: [3987.290] port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
...
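
To put a number on "every now and again", the following is a rough sketch for counting distinct timeout events in a captured log. In the excerpt above each event is logged twice with the same bracketed timestamp, so the sketch deduplicates on that timestamp. The function name count_tx_timeouts is hypothetical, and the log format is assumed to match the excerpt.

```shell
# Hypothetical helper: count distinct tx timestamp timeouts in ptp4l log text
# read from stdin. Events are identified by ptp4l's bracketed time, e.g. [3987.290].
count_tx_timeouts() {
    grep 'timed out while polling for tx timestamp' \
        | grep -o '\[[0-9]*\.[0-9]*\]' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

Assuming the messages end up in the systemd journal under the ptp4l identifier, as the excerpt suggests, an invocation could look like: journalctl -t ptp4l | count_tx_timeouts. Comparing the count before and after changing tx_timestamp_timeout gives a more concrete picture than an impression of frequency.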

Hello,

A quick correction: we are using linuxptp-3.1.1, not 1.3.1. A small but important difference.

We’re still checking this issue with our team. Could you please provide more context or clarification as to why ‘mgbe1_0’ was chosen instead of ‘mgbe2_0,’ as specified in the document?

Hello,

Our network configuration is such that connection via mgbe2_0 is not possible.

Could you please provide more details about the specific limitations or challenges in your network configuration that make a connection via ‘mgbe2_0’ not possible? Understanding developers’ real-world use cases is valuable in assisting with practical solutions.

Unless it is necessary for continued support from NVIDIA, we would prefer not to share additional details on this forum detailing our system architecture.

Please try using the following commands on two devkits connected with ethernet and synchronized with ptp4l:

First unit Command:
/usr/sbin/ptp4l -i mgbe1_0 -p /dev/ptp4 -m -4 -l7 -f /etc/linuxptp/ptp4l-orin.conf

Second unit Command:
/usr/sbin/ptp4l -i mgbe1_0 -p /dev/ptp4 -m -4 -s -f /etc/linuxptp/ptp4l-orin.conf
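
The commands above hard-code /dev/ptp4 as the PTP hardware clock for mgbe1_0. As a hedged sketch, the clock device for an interface can be read from the output of ethtool -T instead of being assumed. The helper name phc_device_from_ethtool is hypothetical, and the sketch assumes an ethtool version that prints a "PTP Hardware Clock: N" line, which may differ between releases.

```shell
# Hypothetical helper: read `ethtool -T <iface>` output on stdin and print the
# matching PHC device path (e.g. /dev/ptpN). Prints nothing if no clock is listed.
phc_device_from_ethtool() {
    sed -n 's|^PTP Hardware Clock:[ ]*\([0-9][0-9]*\)[ ]*$|/dev/ptp\1|p'
}
```

For example: ethtool -T mgbe1_0 | phc_device_from_ethtool. The same ethtool output also lists the interface's timestamping capabilities, which is a quick way to confirm that hardware transmit timestamping is reported for the port before passing the device to ptp4l with -p.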

Neither those executables nor that config file exist on a stock 6.0.6 install. Also, the -4 flag selects the IPv4 transport, which is incompatible with the 802.1AS (gPTP) configuration we want.

Please refer to PTP Bridging in Orin and locate the required files under the following directory path:

~/nvidia/nvidia_sdk/DRIVE_OS_6.0.6_SDK_Linux_DRIVE_AGX_ORIN_DEVKITS/DRIVEOS/drive-linux/samples/nvavb/daemons

These files should be available in the specified directory on your host system.

Hello,

I was able to identify the files in question. Of note is that the configs on the NVIDIA side do indeed contain the tx_timestamp_timeout override to 1000 ms, which, as discussed, drastically reduces the occurrence of tx timestamp timeout errors but does not eliminate them. Can NVIDIA speak to why this change is necessary and whether NVIDIA has any plans to release a driver fix?