DriveAGX Thor ptp4l fails on mgbe3_0 when DriveWorks sample_camera starts to stream video

Dear @david.cattley ,
When you notice the SMMU fault on the serial console, does sample_camera crash, or does it keep running without any issue?
In the shared sample_camera log I don’t see any such error. It looks like the app was still running and had not been stopped (CTRL + C) when the logs were captured. Please clarify this so we can get more insight.

From earlier in the thread:

So you are correct that sample_camera or even recorder continue to trundle along just fine. But the problem I first noticed and continue to be concerned about is that the SMMU fault then propagates to a mgbe3_0 timesync failure so PTP no longer works.

This is a fundamental issue for capturing with synchronized sensors across two (or more) DriveAGX units if they cannot be kept in sync.

This is also a fundamental issue for capturing data from image sensors with FSYNC coherency to other external sensors that are synchronized to PTP.

Are you referring to the contents of the logs in that previously attached archive?

If you are looking at the stdout/stderr spew from sample_camera then you are looking in the wrong place. It seems to be perfectly unaware that the underlying MIPI/CSI-2 ingress engine has gone off and whacked some memory address it is not permitted to access. You need to look at the CCPLEX output to see the SMMU fault.

Take the file tio-thor.txt and use your tool to demux the debug port capture on it. Look at the CCPLEX output.

You can also see the secondary impact in the file dmesg-tail.txt, which shows all of the failures logged by the nvethernet driver after the fault. Similarly, the output in journactl-tail.txt shows the impact on ptp4l as well as the kernel logging from nvethernet.
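For anyone who wants to pull just the relevant lines out of those two files, here is a minimal sketch. The keyword list is taken from the names used in this thread (nvethernet, ptp4l, mgbe3_0). The helper names `filter_lines` and `filter_file` are made up for illustration, and nothing here assumes a particular log line format.

```python
from typing import Iterable, List, Sequence

# Names mentioned in this thread; extend as needed.
KEYWORDS = ("nvethernet", "ptp4l", "mgbe3_0")


def filter_lines(lines: Iterable[str],
                 keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that mention any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [ln.rstrip("\n") for ln in lines
            if any(k in ln.lower() for k in lowered)]


def filter_file(path: str,
                keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Apply filter_lines to a text file such as dmesg-tail.txt."""
    # errors="replace" keeps going if the capture contains stray bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return filter_lines(fh, keywords)
```

Calling `filter_file("dmesg-tail.txt")` and then the same on the journal capture gives two short lists that can be lined up against the time of the SMMU fault.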

The archive contains a file notes.txt that explains how the data was collected.

Since the output of sample_camera is captured with tee on a single command line, a CTRL-C issued at the shell delivers the signal to both sample_camera and tee, so whatever sample_camera prints in response to the signal is likely just discarded. There is nothing particularly interesting here. All of the failures happened long before the shutdown sequence.
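If the shutdown output is ever wanted, the capturing side has to survive the interrupt. POSIX tee has a -i option that ignores SIGINT for exactly this reason. The same idea as a minimal Python sketch, where `ignore_sigint` and `tee_copy` are illustrative names and not part of any NVIDIA tool:

```python
import signal
from typing import BinaryIO, Iterable


def ignore_sigint() -> None:
    """Make this process ignore CTRL-C so it keeps draining the pipe."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def tee_copy(src: BinaryIO, sinks: Iterable[BinaryIO]) -> int:
    """Copy src line by line to every sink; return the bytes copied."""
    sinks = list(sinks)
    total = 0
    for line in src:
        for sink in sinks:
            sink.write(line)
            sink.flush()  # flush so nothing is lost on an abrupt exit
        total += len(line)
    return total


# Hypothetical use as the right-hand side of a pipe:
#   import sys
#   ignore_sigint()
#   with open(sys.argv[1], "ab") as log:
#       tee_copy(sys.stdin.buffer, [sys.stdout.buffer, log])
```

With the reader ignoring SIGINT, only the producer reacts to CTRL-C, and the copy loop ends when the producer closes its end of the pipe.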

Dear @david.cattley ,
We ran sample_camera with V1SIM728MPRU4120ND1 and are attaching the logs from our local test. We don’t see an SMMU fault.

Serial console log : ACM0 output.txt (120.5 KB)
Sample_camera log : sample_camera lraw log.txt (117.4 KB)
GUI output via VNC session :


Used configuration : "parameter": "camera-name=V1SIM728MPRU4120ND1,disable-custintf=1,disable-auth=1,interface=csi-ab,skip-eeprom=1,CPHY-mode=1,link=2,fifo-size=8,output-format=raw+processed,async-record=1,deserializer=MAX96724_Fusa_nv,file-buffer-size=16777216,format=lraw",
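When comparing rig files between setups, it can help to split the "parameter" string into key/value pairs and diff them. A minimal sketch follows. The first string is the configuration quoted above. The second is a hypothetical variant with only the interface changed to csi-ef, purely to show the diff; a real csi-ef rig will likely differ in other fields too. The function names are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def parse_rig_parameter(parameter: str) -> Dict[str, str]:
    """Split 'k1=v1,k2=v2,...' into a dict; values stay as strings."""
    result: Dict[str, str] = {}
    for item in parameter.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        result[key] = value
    return result


def diff_parameters(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}


CSI_AB = ("camera-name=V1SIM728MPRU4120ND1,disable-custintf=1,disable-auth=1,"
          "interface=csi-ab,skip-eeprom=1,CPHY-mode=1,link=2,fifo-size=8,"
          "output-format=raw+processed,async-record=1,"
          "deserializer=MAX96724_Fusa_nv,file-buffer-size=16777216,format=lraw")

# Hypothetical: same string with only the interface swapped.
CSI_EF_GUESS = CSI_AB.replace("interface=csi-ab", "interface=csi-ef")

params = parse_rig_parameter(CSI_AB)
print(diff_parameters(params, parse_rig_parameter(CSI_EF_GUESS)))
```

Feeding the actual parameter strings from two rig files into `diff_parameters` shows at a glance whether two test setups really match.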

Thanks for the update.

I see that the test was on CSI AB with a MAX96724 deserializer.

I did previously test with one of the MAX96724 ports and found the symptom to be identical.

I will point out that my scenario uses CSI EF which has the MAX96712 deserializer.

I don’t know where to go from here. This reproduces easily (100% repro rate) on three systems under my control. What more can I collect and provide to you?

Does that mean you see the same SMMU fault on the csi-ab port as well? Could you check?

We will share our observations on the csi-ef port soon.

I will check csi-ab again specifically and get back to you.

I set up exactly this configuration. I recorded the debug port output with tcu_muxer and copied the console output from running sample_camera. All of that output, along with the rig file used, is in the attached ZIP.

The SMMU fault occurs as it has always done for me. The separated TCUMux output for the relevant channel is in CCPLEX7.txt:

nvhost server initialization completed
nvhost guest connect: module_id:1
nvhost guest connect: module_id:2
!!!
!!! SMMU FAULT OCURRED !!!
!!!
Event queue Interrupt triggered on Smmu Instance: 0x0
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Failed to fetch the Input Address
Input Address is: 0xffbec20000
IPA: 0xffbec20000
STAG: 0x0
Stall: 0x0
Data
Unprivileged
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error attribute: : 0x0
Error report timestamp: : 0x149e46f6ed
!!!
!!! SMMU FAULT OCURRED !!!
!!!
Event queue Interrupt triggered on Smmu Instance: 0x0
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Failed to fetch the Input Address
Input Address is: 0xffbec20000
IPA: 0xffbec20000
STAG: 0x0
Stall: 0x0
Data
Unprivileged
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error attribute: : 0x0
Error report timestamp: : 0x14a79a477a
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Failed to fetch the Input Address
Input Address is: 0xffbec20020
IPA: 0xffbec20000
STAG: 0x0
Stall: 0x0
Data
Unprivileged
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error attribute: : 0x0
Error report timestamp: : 0x14ad3e604a
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Failed to fetch the Input Address
Input Address is: 0xffbec20020
IPA: 0xffbec20000
STAG: 0x0
Stall: 0x0
Data
Unprivileged
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error attribute: : 0x0
Error report timestamp: : 0x14b3dcfbd9

2026-04-10-sample_camera-nv-v1sim728-cam4-link2-raw-processed-lraw-smmu_fault.zip (152.6 KB)
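To compare fault records across runs or systems, the CCPLEX text can be reduced to a few fields per fault. Below is a minimal sketch that splits on "Translation Fault occurred:" and pulls out StreamId, access type, Input Address, IPA, error code and report timestamp. The field labels are taken from the excerpt above. The "Read" alternative is an assumption, since only "Write access" appears in this capture. The sample embeds the first and third records from the excerpt with some lines omitted.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# First and third records from the CCPLEX7.txt excerpt (some lines omitted).
SAMPLE = """\
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Input Address is: 0xffbec20000
IPA: 0xffbec20000
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error report timestamp: : 0x149e46f6ed
Translation Fault occurred:
StreamId: 0x2e01
Write access
Stage 2 fault occurred
Input Address is: 0xffbec20020
IPA: 0xffbec20000
NSIPA: 0x0

Tegra Report Error
Reporter ID: : 0x8287
Error code: : 0x20002e01
Error report timestamp: : 0x14ad3e604a
"""


@dataclass
class SmmuFault:
    stream_id: Optional[int]
    access: Optional[str]
    input_address: Optional[int]
    ipa: Optional[int]
    error_code: Optional[int]
    timestamp: Optional[int]


def _hex_field(label: str, chunk: str) -> Optional[int]:
    # Anchored at line start so "IPA" does not match inside "NSIPA".
    pattern = rf"^[ \t]*{re.escape(label)}[ \t:]*(0x[0-9a-fA-F]+)"
    m = re.search(pattern, chunk, re.MULTILINE)
    return int(m.group(1), 16) if m else None


def parse_smmu_faults(text: str) -> List[SmmuFault]:
    """Return one SmmuFault per 'Translation Fault occurred:' record."""
    faults = []
    for chunk in text.split("Translation Fault occurred:")[1:]:
        # "Read" is assumed; only "Write access" appears in this capture.
        access = re.search(r"^[ \t]*(Read|Write) access", chunk, re.MULTILINE)
        faults.append(SmmuFault(
            stream_id=_hex_field("StreamId", chunk),
            access=access.group(1) if access else None,
            input_address=_hex_field("Input Address is", chunk),
            ipa=_hex_field("IPA", chunk),
            error_code=_hex_field("Error code", chunk),
            timestamp=_hex_field("Error report timestamp", chunk),
        ))
    return faults


faults = parse_smmu_faults(SAMPLE)
for f in faults:
    print(f"stream={f.stream_id:#x} {f.access} "
          f"input={f.input_address:#x} ipa={f.ipa:#x} ts={f.timestamp:#x}")
```

In this capture every record has the same StreamId (0x2e01) and the same IPA, and the last two records report an Input Address 0x20 above it. The low 16 bits of the error code (0x20002e01) also match the StreamId, though that is only an observation from these records and not a documented layout.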