Lately we have experienced crashes/shutdowns when building TensorRT engines with trtexec in MAXN mode on JetPack 4.6.1. During the build, the Xavier suddenly shuts down with no notable log entries. (Attached: kern.log, sys.log and the trtexec verbose log; the crash is at timestamp ~14:07:06.)
We have to power it on again after that by pressing the power button.
The problems only occur in MAXN mode.
We had similar problems when running PyTorch scripts on JetPack 4.6. An upgrade to 4.6.1 solved those issues.
(Jetson AGX Xavier MAXN Mode crashes - #18 by vovea)
Hi,
do you mean R34.1.1?
I got the opportunity today to test the engine build on an AGX Orin (R34.1.1), which went well - but I am not sure if the comparison between the AGX Xavier and AGX Orin makes a lot of sense.
Since we only have one Xavier device, and it runs software from multiple people, version upgrades usually take us longer.
Here is the ONNX model whose engine build crashes the AGX Xavier: ONNX model
Here is the kern.log from today’s AGX Xavier crash. It occurred at ~13:48:37, during the TensorRT engine build. It seems that the device rebooted itself. kern.log (102.7 KB)
Dear @vovea,
I don’t see any issue with the latest release (JetPack 5.0.1) on Jetson Xavier. Could you consider upgrading to the latest release to avoid this issue and to get the latest TensorRT features?
Hello,
thanks for the info. Were you able to reproduce the issue with JetPack 4.6.1? I ask because we see similar problems on our Jetson Nano, for which no JetPack 5 release is available.
Hi,
it turned out that the problem was a shunt (0.1 Ohm) for power measurements that we had put into the voltage supply line to the carrier board. However, it is not clear to us how this shunt can cause the issue described above.
Is it possible that the voltage drop across the shunt triggers the dV/dt circuit, which sets the VDDIN_PWR_BAD signal and initiates a shutdown? If so, shouldn't this be a clean shutdown with entries in the log file?
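As a rough sanity check of this hypothesis, the drop across the shunt can be estimated with Ohm's law. The sketch below is only illustrative: the 0.1 Ohm value is ours, but the load powers and supply voltages are assumed example numbers, not measurements, and the current is approximated as I ≈ P / V (ignoring that the drop itself raises the current slightly).

```python
SHUNT_OHM = 0.1  # shunt resistance in our supply line


def shunt_drop_v(load_power_w, supply_v, shunt_ohm=SHUNT_OHM):
    """Approximate voltage drop across a series shunt.

    Assumes the board draws roughly I = P / V from the supply,
    then applies Ohm's law: V_drop = I * R.
    """
    current_a = load_power_w / supply_v
    return current_a * shunt_ohm


if __name__ == "__main__":
    # Assumed example operating points, not measured values.
    for supply_v in (9.0, 19.0):
        for load_power_w in (10.0, 30.0, 60.0):
            drop = shunt_drop_v(load_power_w, supply_v)
            print(f"{supply_v:5.1f} V supply, {load_power_w:5.1f} W load "
                  f"-> drop of about {drop:.3f} V")
```

The drop scales with load current, so a short current spike during the engine build would produce a correspondingly sharp dip at the carrier board input, and the dip is larger at lower supply voltages for the same power.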