Fatal error on DAGX

My DRIVE AGX encounters a fatal error and Tegra-X1 does not start.
This AGX has never been flashed; I have been using it for less than a week, in the state in which it arrived.
Is this caused by a hardware problem?

version result:

shell> version
Info: Executing cmd: version, argc: 0, args: 
SW Version: DRIVE-V5.0.10-E3550-EB-Aurix-With3LSS-ForHyperion-3.00.07
Compilation date: Jul 20 2018, 09:08:33
Command Executed

l3_getsafetystatus result:

shell> l3_getsafetystatus
Info: Executing cmd: l3_getsafetystatus, argc: 0, args: 

Platform FuSa State: UNSAFE STATE
Platform Startup Status: STARTUP FAIL
Tegra A FuSa State: INIT STATE
Tegra B FuSa State: SAFE STATE
Tegra A nSAFE State: nSAFE ASSERTED
Tegra B nSAFE State: nSAFE DEASSERTED
Platform Error Log Count: 1
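
For anyone comparing the two Tegras from a captured console log, here is a minimal Python sketch that parses the status text above and lists the fields where Tegra A and Tegra B disagree. The function names are hypothetical, and it assumes the output has been saved as plain text in the "Key: value" form shown here; it is not an NVIDIA tool.

```python
SAMPLE_STATUS = """\
shell> l3_getsafetystatus
Info: Executing cmd: l3_getsafetystatus, argc: 0, args:

Platform FuSa State: UNSAFE STATE
Platform Startup Status: STARTUP FAIL
Tegra A FuSa State: INIT STATE
Tegra B FuSa State: SAFE STATE
Tegra A nSAFE State: nSAFE ASSERTED
Tegra B nSAFE State: nSAFE DEASSERTED
Platform Error Log Count: 1
"""


def parse_status(text):
    """Collect 'Key: value' lines into a dict, skipping the shell echo lines."""
    fields = {}
    for line in text.splitlines():
        if line.startswith(("shell>", "Info:")):
            continue
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def tegra_differences(fields):
    """Return {field: (tegra_a_value, tegra_b_value)} where the two differ."""
    differences = {}
    prefix_a, prefix_b = "Tegra A ", "Tegra B "
    for key, value_a in fields.items():
        if not key.startswith(prefix_a):
            continue
        suffix = key[len(prefix_a):]
        value_b = fields.get(prefix_b + suffix)
        if value_b is not None and value_b != value_a:
            differences[suffix] = (value_a, value_b)
    return differences


if __name__ == "__main__":
    for name, (a, b) in tegra_differences(parse_status(SAMPLE_STATUS)).items():
        print(f"{name}: Tegra A = {a}, Tegra B = {b}")
```

This only reports differences between the two Tegras; it does not interpret what each state means.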

l3_getdriveerrorlog result:

shell> l3_getdriveerrorlog
Info: Executing cmd: l3_getdriveerrorlog, argc: 0, args: 

Tegra x1 :
Error 1: 201f004

Tegra x2 :
Zero errors logged.

MCU :
Zero errors logged.
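
The error log above can be grouped per unit with a short script. This is only a sketch under the assumption that the log keeps the layout shown here (a unit name ending in ":" followed by "Error N: code" lines); the function name is hypothetical and the error codes are kept as opaque strings, since their meaning is not given in this thread.

```python
SAMPLE_ERROR_LOG = """\
shell> l3_getdriveerrorlog
Info: Executing cmd: l3_getdriveerrorlog, argc: 0, args:

Tegra x1 :
Error 1: 201f004

Tegra x2 :
Zero errors logged.

MCU :
Zero errors logged.
"""


def parse_error_log(text):
    """Return {unit_name: [error_code, ...]} from captured console output."""
    errors = {}
    unit = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith(("shell>", "Info:")):
            continue  # skip the command echo printed by the Aurix shell
        if line.endswith(":"):
            # A line such as "Tegra x1 :" starts a new unit section.
            unit = line[:-1].strip()
            errors[unit] = []
        elif line.startswith("Error ") and unit is not None:
            # "Error 1: 201f004" -> keep only the code after the colon.
            _, _, code = line.partition(":")
            errors[unit].append(code.strip())
    return errors


if __name__ == "__main__":
    for unit, codes in parse_error_log(SAMPLE_ERROR_LOG).items():
        print(unit, codes if codes else "no errors")
```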

ttyUSB2 shows the following error:

711 ÿÿ^@ÿÿþ 6535451|HV/c0: ÿÿ^@ÿÿþCPU:0, Error:CBBNOCAXI^M
....

I have attached the ttyUSB2 error log captured when running the tegrareset command.
tegra-a-ttyUSB2-log.txt (37 KB)

Dear yk-fujii,

Sorry for the inconvenience.
Could you please share the board serial number? Thanks.

Hi SteveNV,

Thanks for the response.
The serial number is E3550-B03-S0859.

Dear yk-fujii,

Thank you for your update.
Could you please update the PDK and the Aurix firmware?
Then please let me know whether the problem still reproduces. Thanks.

Hi SteveNV,

The flash appears to have succeeded, and the AGX started up.
Thanks for the suggestion!

I have some questions related to this problem:

  • Can you guess the cause of this problem? Is it attributable to the guest OS or to the hypervisor?

  • I modified a systemd script, “nv_hyperion_net_init.sh”, in order to change the network settings.
    Is that related to this issue?
    (I assigned 10.42.0.28 to eth0 so that it could communicate with the Aurix.)

  • When a problem like this occurs, is there no way to recover other than flashing?
    (For example, a way to mount the eMMC and restore the settings?)

  • I have seen errors occur suddenly when turning the power on or off. This is already the second time.
    I think I should pay special attention to the power off/on procedure.
    I understand how to power off from the Aurix over serial, but is there no way to shut down from the AGX alone?

Thank you for your prompt update.
-. I think we need to investigate further to find the root cause.
-. Sorry, I have not checked the network settings with the old PDK version.
According to the network information shown in the Aurix terminal on the new PDK (5.0.13.2), 10.42.0.28 is the Tegra A IP address, as shown below.

shell> version
Info: Executing cmd: version, argc: 0, args: 
SW Version: DRIVE-V5.0.13-E3550-EB-Aurix-ForHyperion-3.01.05
Compilation date: Nov 12 2018, 12:50:16
Command Executed
shell> status
Info: Executing cmd: status, argc: 0, args: 
Alive       : 65:43:19
CPU load     Core 0: 2%
CPU load max Core 0: 4%
CPU load     Core 1: 0%
CPU load max Core 1: 0%
CPU load     Core 2: 0%
CPU load max Core 2: 0%
CPU load     Core 3: 0%
CPU load max Core 3: 0%
CPU load     Core 4: 0%
CPU load max Core 4: 0%
CPU load     Core 5: 0%
CPU load max Core 5: 0%

Hardware information: 
SystemUpInit-Time[ms]: 79


IP-address (Tegra-A): 10.42.0.28
IP-address (Tegra-B): 10.42.0.29
IP-address   (AURIX): 10.42.0.146

MAC-address (Tegra-A): 0x000000044B9B954D
MAC-address (Tegra-B): 0x000000044B9B954E
MAC-address   (AURIX): 0x000000044B9B954F

RAM Usage: 1073792704 bytes
Command Executed

According to the network configuration, 10.42.0.28 is assigned to br0.200, as shown below.

nvidia@tegra-ubuntu:~$ ifconfig -a
br0       Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:10.19.11.206  Bcast:10.19.11.255  Mask:255.255.255.0
          inet6 addr: fe80::2ceb:18ff:fe71:3e33/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:131156 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13013 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:11597291 (11.5 MB)  TX bytes:1400789 (1.4 MB)

br0:400   Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

br0:900   Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:10.1.0.81  Bcast:10.1.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

br0.200   Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:10.42.0.28  Bcast:10.42.0.255  Mask:255.255.255.0
          inet6 addr: fe80::2ceb:18ff:fe71:3e33/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8731 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10751 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:582132 (582.1 KB)  TX bytes:998008 (998.0 KB)
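
To confirm which interface currently holds a given address, the ifconfig output can be scanned with a few lines of Python. This is a rough sketch that assumes the older net-tools output format shown above ("inet addr:..." on an indented line under the interface name); the function name is hypothetical, and the sample is trimmed from the listing in this post.

```python
import re

SAMPLE_IFCONFIG = """\
br0       Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:10.19.11.206  Bcast:10.19.11.255  Mask:255.255.255.0
          inet6 addr: fe80::2ceb:18ff:fe71:3e33/64 Scope:Link

br0:400   Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0

br0.200   Link encap:Ethernet  HWaddr 2e:eb:18:71:3e:33
          inet addr:10.42.0.28  Bcast:10.42.0.255  Mask:255.255.255.0
"""


def interfaces_by_address(ifconfig_text):
    """Map each IPv4 address to the interface block it appears under."""
    owners = {}
    current = None
    for line in ifconfig_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new interface block.
            current = line.split()[0]
        match = re.search(r"inet addr:(\S+)", line)
        if match and current is not None:
            owners[match.group(1)] = current
    return owners


if __name__ == "__main__":
    owners = interfaces_by_address(SAMPLE_IFCONFIG)
    print(owners.get("10.42.0.28", "address not assigned"))
```

If a modified init script has moved 10.42.0.28 to another interface such as eth0, the lookup would report that interface instead of br0.200.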

-. If you can access Tegra A from Tegra B with the ssh command below, it may be possible to change the setting back.
On the Tegra B console:
$ ssh nvidia@10.42.0.28

-. Powering On/Off the Device (please refer to the DRIVE OS → Setting Up Your Board → Powering On/Off the Device section of the DRIVE™ Software Documentation (ZIP)):

Use these procedures to power on and power off the DRIVE Development Platform device.
To power on the DRIVE platform device
1. Connect the target to the host system.
For information about connecting the DRIVE board, see Setting Up Your Platform.
2. Plug the power supply into the AC outlet.
3. Turn the power switch on the power supply to the ON position.
The DRIVE platform starts and launches a web browser displaying a welcome page. The welcome page provides links to example applications on the desktop and useful links.
To power off the DRIVE platform device
1. Shut down the operating system by executing the command:
shutdown -h now
2. Power off the AURIX firmware, using the steps in Flashing the Firmware.
3. Unplug the power supply.