Issues: Jetson Thor Random Shutdown

Hi all, we are working with a Jetson Thor Developer Kit on JetPack 7.0 (L4T R38.2.0). I have two separate issues.

  1. When I attach accessories such as a mouse, keyboard, and monitor, it will randomly shut down within 1-3 minutes. One time I was able to set up a static IP address so I could SSH in headless, and now I run into the second issue.
  2. When we try to bring up the onboard Ethernet, all mgbeX_0 interfaces appear but none will establish a link. ip link show always reports <NO-CARRIER> and state DOWN even when connected directly to another device. dmesg shows the MACs initializing but failing to talk to any PHY:
nvethernet a808a10000.ethernet: failed to read MDIO address
nvethernet a808a10000.ethernet: failed to get phy reset gpio error: -2
nvethernet ... mgbeX_0 (HW ver: 42) created with 4 DMA channels
phydev is null

/etc/nv_tegra_release reports:

# R38 (release), REVISION: 2.0, ... BOARD: generic, EABI: aarch64

I tried reflashing the Jetson following the "Re-enable USB stick installation" page of the Jetson AGX Thor Developer Kit User Guide and ran into the same issues. I am not completely sure what is going wrong. Thank you.

Best,

Daniel

Hi,

Please move to rel-38.2.1 first.

As for MGBE, may I know what design you are using?

Hi,

Thanks for the quick response. As for moving to the newer version, I am unsure how to do that. I tried sudo apt update, then sudo apt upgrade (following "How to upgrade r38.2.1 - #2 by whitesscott"). It still reported that I am on 38.2.0. To be honest, this is less of an issue since I can just connect headless, and the power seems pretty stable then.
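For anyone checking the same thing, here is a minimal Python sketch that extracts the release number from the first line of /etc/nv_tegra_release. It assumes the line follows the "# R38 (release), REVISION: 2.0, ..." pattern quoted earlier in this thread; the function name parse_l4t_release is my own and not part of any NVIDIA tool.

```python
import re
from pathlib import Path


def parse_l4t_release(text):
    """Return the L4T version as 'major.revision' (e.g. '38.2.0'), or None.

    Assumes the format seen in this thread:
    '# R38 (release), REVISION: 2.0, ...'
    """
    m = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+(\d+(?:\.\d+)*)", text)
    if m is None:
        return None
    return f"{m.group(1)}.{m.group(2)}"


if __name__ == "__main__":
    path = Path("/etc/nv_tegra_release")
    if path.exists():
        print(parse_l4t_release(path.read_text()))
    else:
        print("/etc/nv_tegra_release not found (probably not a Jetson)")
```

If this still prints 38.2.0 after an upgrade attempt, the upgrade did not take effect.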

Also, I am not sure about the MGBE design. I believe it is the MGBE (multi-gig) MAC? I thought that if we plugged an external device into the Ethernet port, at least one of our mgbeX_0 interfaces would change to state "UP". I got it working on my regular Linux machine but am unable to get the Thor to even recognize that an Ethernet cable is plugged into it. I would appreciate some info about how to debug this issue. Thank you

Are you talking about a “shutdown” or a “reboot”?

Sorry. I just noticed you are talking about the NV devkit. Please ignore my previous question.
How is your connection there? What cable are you using?

It seems like it is crashing and trying to reboot. Sometimes it will make it past the initial loading screen and get to the user sign in but usually not. Again only when attached to accessories and not a huge issue. Was just bringing it up in case it was related to our other issue.

Our internet speed is pretty fast (67 Mbps download and 140 Mbps upload). The cable is marked "CAT6 Patch for GB Ethernet 550 MHz rohs 2506".

Thank you

Hi,

For the reboot issue, rel-38.2.1 should fix this. You could try to flash it with sdkmanager.

I don’t understand what you are trying to say here. I mean, what cable are you connecting to the MGBE? That port is not an RJ45 but a QSFP module.

Sorry I think I misunderstood about the internet speed. We connected a Cat6 Ethernet patch cable with RJ-45 ends. We are connecting it into port 2 in the following image. I was under the impression that that was a standard RJ45 Ethernet Port.

That one is not an MGBE port. The MGBE is port 6 in your picture.

I think I got the ports badly wrong. I am trying to connect an RJ45 cable into port 2 to talk to a robotic arm over TCP. I assumed that it would show up in ip addr, but I could not see eth0, so I assumed mgbe0_0 was our RJ45 Ethernet port.

Here is my ip addr (masked):

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
3: can1: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
4: can2: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
5: mgbe0_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
6: can3: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
7: mgbe1_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
8: mgbe2_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
9: mgbe3_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
10: wlP1p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.100/24 brd 192.168.4.255 scope global noprefixroute wlP1p1s0
       valid_lft forever preferred_lft forever
    inet6 fd62:c659:276c:1:XXXX:XXXX:XXXX:XXXX/64 scope global temporary dynamic
       valid_lft 603777sec preferred_lft 84885sec
    inet6 fd62:c659:276c:1:XXXX:XXXX:XXXX:XXXX/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 2591924sec preferred_lft 604724sec
    inet6 2001:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX/64 scope global temporary dynamic
       valid_lft 492sec preferred_lft 492sec
    inet6 2001:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 492sec preferred_lft 492sec
    inet6 fe80::XXXX:XXXX:XXXX:XXXX/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
11: l4tbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
    inet 192.168.55.1/24 scope global l4tbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link tentative
       valid_lft forever preferred_lft forever
12: usb0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master l4tbr0 state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
13: usb1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master l4tbr0 state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
14: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

Can I communicate over TCP through the RJ45 port (port 2)? Or will I need to use some sort of Ethernet-to-USB adapter? Thanks a ton.

I think none of the interfaces in this list is the RJ45 one.
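To make that easier to see at a glance, here is a minimal Python sketch that parses ip addr output and lists which interfaces report a physical link (the LOWER_UP flag). It assumes the header-line layout shown in the listing above; the function names are hypothetical, and the sample text is trimmed from that listing.

```python
import re

# Matches header lines such as:
# "5: mgbe0_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN ..."
HEADER = re.compile(r"^(\d+):\s+(\S+?):\s+<([^>]*)>", re.MULTILINE)


def parse_interfaces(ip_addr_text):
    """Return a list of (name, flags) tuples, one per interface header line."""
    result = []
    for m in HEADER.finditer(ip_addr_text):
        flags = [f for f in m.group(3).split(",") if f]
        result.append((m.group(2), flags))
    return result


def interfaces_with_link(ip_addr_text):
    """Return names of interfaces whose flags include LOWER_UP."""
    return [name for name, flags in parse_interfaces(ip_addr_text)
            if "LOWER_UP" in flags]


SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
5: mgbe0_0: <BROADCAST,MULTICAST> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
10: wlP1p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(interfaces_with_link(SAMPLE))
```

On the listing above, only lo and wlP1p1s0 have a link, and no wired interface other than the MGBE ports appears at all.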

Please reboot your device and attach your full dmesg log for us to check.

The command is just sudo dmesg > log.txt; then attach that txt file.
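If you want to skim the log yourself before attaching it, a small Python sketch like the one below can pull out the network-related lines. The keyword list (nvethernet, pcie, phy) is only my guess at what is relevant here, and filter_log is a hypothetical helper, not an NVIDIA tool.

```python
def filter_log(lines, keywords=("nvethernet", "pcie", "phy")):
    """Return the lines that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [line for line in lines
            if any(k in line.lower() for k in lowered)]


if __name__ == "__main__":
    import os
    # log.txt is the file produced by: sudo dmesg > log.txt
    if os.path.exists("log.txt"):
        with open("log.txt", errors="replace") as fh:
            for line in filter_log(fh.read().splitlines()):
                print(line)
```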

log.txt (125.9 KB)

Here is my log. Thanks

Could you share the result of “lspci”?

lspci
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 22e6
0000:01:00.0 3D controller: NVIDIA Corporation Device 2b00 (rev a1)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 22d8
0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8852CE PCIe 802.11ax Wireless Network Controller (rev 01)
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 22d8
0005:01:00.0 Non-Volatile memory controller: Sandisk Corp WD PC SN5000S M.2 2280 NVMe SSD (DRAM-less)

Here is lspci. Thank you.

OK. We have a problem: the Ethernet chip is missing at the PCIe level.

Please reflash your board with rel-38.2.1 first to see whether it comes back.

If it does not, I would suggest considering an RMA.
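As a rough illustration of that check, the Python sketch below scans lspci output for a device whose class is "Ethernet controller". The helper name is hypothetical, and the sample is the lspci output posted above.

```python
def find_ethernet_controllers(lspci_text):
    """Return lspci lines whose device class is 'Ethernet controller'."""
    return [line.strip() for line in lspci_text.splitlines()
            if "Ethernet controller" in line]


# lspci output as posted in this thread (before the reflash).
LSPCI_BEFORE = """\
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 22e6
0000:01:00.0 3D controller: NVIDIA Corporation Device 2b00 (rev a1)
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 22d8
0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8852CE PCIe 802.11ax Wireless Network Controller (rev 01)
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 22d8
0005:01:00.0 Non-Volatile memory controller: Sandisk Corp WD PC SN5000S M.2 2280 NVMe SSD (DRAM-less)
"""

if __name__ == "__main__":
    found = find_ethernet_controllers(LSPCI_BEFORE)
    print(found if found else "no Ethernet controller enumerated on PCIe")
```

The Realtek entry is the Wi-Fi card (class "Network controller"), so it does not count as the wired Ethernet chip.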

OK, I am currently reflashing my board through SDK Manager. I will let you know if there are any changes in ip addr afterwards. Thank you for the help.

I am not sure if I should keep posting in the same thread, but now it looks like SDK Manager cannot even recognize the connected Jetson Thor.

I want to clarify: is this your first time flashing or not?

Hi, sorry for the delay. This is not my first time flashing. I already used a thumb drive to flash the JetPack 7.0 image available online.

Sorry, but your comment tells me this is your first time flashing a Jetson.

Only sdkmanager truly flashes the board. The other methods do not.

Please follow this page and also remember to put your Jetson into recovery mode before running sdkmanager.

Hi Wayne, thanks for all of the help. I can see the Ethernet as enP2p1s0 in ip addr now. I believe this is healthy:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: mgbe0_0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
3: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
4: can1: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
5: mgbe1_0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
6: can2: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
7: can3: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can
8: mgbe2_0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
9: mgbe3_0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1466 qdisc mq state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
10: enP2p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
    inet [IPv4-MASKED]/22 brd [IPv4-BROADCAST-MASKED] scope global dynamic noprefixroute enP2p1s0
       valid_lft 12885sec preferred_lft 12885sec
    inet6 [IPv6-MASKED]/64 scope global temporary dynamic
       valid_lft 604697sec preferred_lft 85786sec
    inet6 [IPv6-MASKED]/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 2591900sec preferred_lft 604700sec
    inet6 [IPv6-MASKED]/64 scope global temporary dynamic
       valid_lft 469sec preferred_lft 469sec
    inet6 [IPv6-MASKED]/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 469sec preferred_lft 469sec
    inet6 fe80::[LINK-LOCAL-MASKED]/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
11: wlP1p1s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
12: l4tbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
    inet [IPv4-MASKED]/24 scope global l4tbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link tentative
       valid_lft forever preferred_lft forever
13: usb0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master l4tbr0 state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
14: usb1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master l4tbr0 state DOWN group default qlen 1000
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
15: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether [MAC-MASKED] brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

Unfortunately, I changed two things to get it to work: I used SDK Manager instead of a USB thumb drive, and I also updated to 38.2.1. I am not sure which one actually fixed it. The power cycling is still happening, although it is not a huge issue.