R_Switch-2 PTP L2 traffic isolation

DRIVE OS Version: 7.0.3

Issue Description: I have connected two PTP L2 clients (lidars) to Thor J2 ports P1, P4. I am running the default nv_ptp4l_master_mgbe2_0.service with default config.

When I start it up, the PTP master on mgbe2_0 goes to the FAULTY state due to “rogue peer delay response”.

Service periodically logs:

port 1 (mgbe2_0): rogue peer delay response

port 1 (mgbe2_0): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)

When I run tcpdump with the filter “ether proto 0x88f7”, I see that the lidars receive each other’s Peer_Delay_Req messages and respond to them.

As far as I understand, the Thor switches (in P3960-TS3) should be PTP aware, which should mean that they only forward these Peer_Delay_Req messages to the port designated as “PTP client” in the “PTP Topology and Port Roles” diagram.

Dear @keijo.kuusmik ,
Could you please share the connection picture, commands and the full logs used in your test?

I attached pictures of where exactly the lidars are connected. In this setup I have 3 of them connected to J2 ports P1, P3, P4.

I use the following command to see the PTP master logs on MGBE-2: journalctl -u nv_ptp4l_master_mgbe2_0.service -f

Jan 30 09:37:16 thor-1 ptp4l[2608]: [3242.480] port 1 (mgbe2_0): FAULTY to MASTER on INIT_COMPLETE
Jan 30 09:37:17 thor-1 ptp4l[2608]: [3243.390] port 1 (mgbe2_0): rogue peer delay response
Jan 30 09:37:17 thor-1 ptp4l[2608]: [3243.390] port 1 (mgbe2_0): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Jan 30 09:37:33 thor-1 ptp4l[2608]: [3259.488] port 1 (mgbe2_0): FAULTY to MASTER on INIT_COMPLETE
Jan 30 09:37:34 thor-1 ptp4l[2608]: [3260.408] port 1 (mgbe2_0): rogue peer delay response
Jan 30 09:37:34 thor-1 ptp4l[2608]: [3260.408] port 1 (mgbe2_0): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Jan 30 09:37:50 thor-1 ptp4l[2608]: [3276.512] port 1 (mgbe2_0): FAULTY to MASTER on INIT_COMPLETE
Jan 30 09:37:51 thor-1 ptp4l[2608]: [3277.425] port 1 (mgbe2_0): rogue peer delay response
Jan 30 09:37:51 thor-1 ptp4l[2608]: [3277.425] port 1 (mgbe2_0): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
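To quantify the flapping in a log like the one above, a small Python sketch along these lines may help. It assumes the line layout seen in this excerpt (ptp4l[pid]: [monotonic time] port N (ifname): message); the function name summarize_ptp4l and both regular expressions are illustrative, not part of linuxptp or DRIVE OS.

```python
import re

# Matches the ptp4l part of a journalctl line as seen in this thread, e.g.
# "... ptp4l[2608]: [3243.390] port 1 (mgbe2_0): rogue peer delay response"
LINE_RE = re.compile(
    r"ptp4l\[\d+\]: \[(?P<mono>\d+\.\d+)\] "
    r"port (?P<port>\d+) \((?P<ifname>[^)]+)\): (?P<msg>.+)$"
)
# Matches state transitions such as "MASTER to FAULTY on FAULT_DETECTED (...)".
TRANSITION_RE = re.compile(r"^(?P<old>[A-Z_]+) to (?P<new>[A-Z_]+) on (?P<event>[A-Z_]+)")


def summarize_ptp4l(lines):
    """Count rogue Pdelay responses and measure time spent in MASTER/FAULTY."""
    summary = {
        "rogue_pdelay_resp": 0,   # number of "rogue peer delay response" lines
        "faults": 0,              # number of transitions into FAULTY
        "master_durations_s": [], # seconds in MASTER before each fault
        "faulty_durations_s": [], # seconds in FAULTY before each recovery
    }
    last = None  # (state, timestamp) of the most recent transition
    for line in lines:
        m = LINE_RE.search(line.rstrip())
        if not m:
            continue
        t = float(m["mono"])
        msg = m["msg"]
        if msg == "rogue peer delay response":
            summary["rogue_pdelay_resp"] += 1
            continue
        tr = TRANSITION_RE.match(msg)
        if not tr:
            continue
        new_state = tr["new"]
        if last is not None:
            prev_state, prev_t = last
            if prev_state == "MASTER" and new_state == "FAULTY":
                summary["master_durations_s"].append(round(t - prev_t, 3))
            elif prev_state == "FAULTY" and new_state == "MASTER":
                summary["faulty_durations_s"].append(round(t - prev_t, 3))
        if new_state == "FAULTY":
            summary["faults"] += 1
        last = (new_state, t)
    return summary
```

Fed the nine lines above, this reports three rogue responses and three faults, with the port staying in MASTER for roughly 0.9 s each time and in FAULTY for roughly 16 s. To use it on a live system, the lines could be read from the output of journalctl -u nv_ptp4l_master_mgbe2_0.service --no-pager.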

nv_ptp4l_master_mgbe2_0.service is unmodified as provided by Nvidia.

I used the following command to capture PTP traffic: tcpdump -i mgbe2_0 '(ether proto 0x88f7) or (udp port 319 or udp port 320)' -w ptp.pcap

I have attached sample ptp.pcap file that I captured.

Note that the lidars are connected directly to the HMTD connectors, with no extra network equipment in between, and they are running a PTP client (IEEE_802_1AS_AUTOMOTIVE profile).

If the R_SWITCH-2 in Thor were PTP aware, I would expect the PTP traffic from each lidar to be restricted to its own switch-lidar link, which doesn’t appear to be the case.
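One way to check this from a capture is to list which stations send peer delay messages and, for each response, which requester it is addressed to. The following is a minimal Python sketch using only the standard library. It assumes the capture is a classic pcap file (not pcapng) with Ethernet link type and PTP carried directly over Ethernet (EtherType 0x88f7), and it relies on the IEEE 1588 message layout: messageType in the low nibble of the first byte, sourcePortIdentity at header offset 20, and requestingPortIdentity at offset 44 in Pdelay_Resp and Pdelay_Resp_Follow_Up. The function names are illustrative.

```python
import struct
from collections import Counter

PTP_ETHERTYPE = 0x88F7
VLAN_ETHERTYPE = 0x8100
# Peer delay message types from the PTP common header (low nibble of byte 0).
MSG_NAMES = {0x2: "Pdelay_Req", 0x3: "Pdelay_Resp", 0xA: "Pdelay_Resp_Follow_Up"}


def iter_pcap_frames(data):
    """Yield raw frames from the bytes of a classic pcap file."""
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # little-endian file (microsecond or nanosecond variant)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # big-endian file
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        _sec, _frac, incl_len, _orig_len = struct.unpack_from(endian + "IIII", data, offset)
        offset += 16
        yield data[offset:offset + incl_len]
        offset += incl_len


def summarize_pdelay(data):
    """Count peer delay messages per (src MAC, type, source port id, requester id)."""
    counts = Counter()
    for frame in iter_pcap_frames(data):
        if len(frame) < 14:
            continue
        ethertype = struct.unpack_from("!H", frame, 12)[0]
        l2_len = 14
        if ethertype == VLAN_ETHERTYPE and len(frame) >= 18:
            # Skip a single 802.1Q tag if present.
            ethertype = struct.unpack_from("!H", frame, 16)[0]
            l2_len = 18
        if ethertype != PTP_ETHERTYPE or len(frame) < l2_len + 34:
            continue
        ptp = frame[l2_len:]
        msg_type = ptp[0] & 0x0F
        if msg_type not in MSG_NAMES:
            continue
        src_mac = frame[6:12].hex(":")
        source_port_id = ptp[20:30].hex()
        requester = None
        if msg_type != 0x2 and len(ptp) >= 54:
            # Responses carry the port identity of the requester they answer.
            requester = ptp[44:54].hex()
        counts[(src_mac, MSG_NAMES[msg_type], source_port_id, requester)] += 1
    return counts
```

After unzipping the attachment, calling summarize_pdelay(open("ptp.pcap", "rb").read()) prints nothing by itself, so the returned counter needs to be printed or inspected. On a link with proper per-port PTP handling, one would expect mgbe2_0 to see responses only for its own requests. Pdelay_Resp entries whose requester is another lidar's port identity would be consistent with the cross-talk described here.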

One additional detail: the Thor MCU has getnetworkconfig/setnetworkconfig commands. Initially it appears to have config 0x3 (Invalid configuration). I also tried changing it to 0x1 (BASE) and 0x2 (BASE_SAFETY), but neither of these configs appeared to make a difference to the PTP behavior of the switch.

ptp.pcap.zip (2.0 KB)

Hello @SivaRamaKrishnaNV ! Any updates on the subject?

Hi folks.

We’ll greatly appreciate support with this issue.

Dear @boris.vnukov ,
My apologies for the delay. I am checking internally on this issue and will update you.

Dear @keijo.kuusmik ,
Could you confirm the FW version using P3960_RTL9072D_CONDOR2.sh? Please see “Switch Firmware Management” in the NVIDIA DriveOS 7.0.3 Linux SDK Developer Guide.

@SivaRamaKrishnaNV I ran the command P3960_RTL9072D_CONDOR2.sh --GetCurrentVersion and it returns v2.1.0.200924.190301

Could you confirm if the service is running by default after reboot with
systemctl status nv_ptp4l_master_mgbe2_0.service or needs manual start?

No need for further debugging. We managed to resolve the switch PTP behavior by upgrading the switch firmware to the latest version included in DriveOS.

Dear @keijo.kuusmik ,
Thank you for the update. What is the FW version now?

Could you clarify this as well?

Yes, in our setup nv_ptp4l_master_mgbe2_0.service was running by default after reboot.

I updated to the latest I found in DriveOS and in the online docs which is:

Condor 1 - P3960_RTL907xDx_v2.2.0.191224.180301.bin
Condor 2 - P3960_RTL907xDx_v2.2.0.191224.190301.bin
Marvell - P3960_88Q5152_flash.v0004.07007.0023.01.bin

It may be that not all of these were needed, but for completeness all were updated to the latest versions that were available.

Yes, that is the expected behavior, and I am assuming you have not run it again.
