NX: eth0 sometimes does not work after many reboots

Hi,
After the NX device has rebooted many times, eth0 sometimes stops working.
The IP address is OK, but pinging the network fails (the host is not reachable).
After running sudo ifconfig eth0 down and then sudo ifconfig eth0 up, the network works again.

hello qitb,

Could you please share how many test cycles it takes to reproduce the failure?
For reference, please also gather detailed logs from a run in which eth0 fails. Thanks.

#!/bin/bash

# Run this script after the system powers on to test the eth0 status.

while true
do
date
ping -c 8 -w 100 192.128.60.200
if [[ $? != 0 ]];then
echo "Ping fail!!! "
else
echo "Ping ok!!! "
fi

# In the ping test above, about 1 run out of 100 fails ("Ping fail").
# At that point ifconfig shows that the eth0 local IP is OK (192.168.60.11), but ping still fails.
# The following steps are then run:


sleep 20
    echo 'nvidia' | sudo -S ifconfig eth0 down
    sleep 5
    echo 'nvidia' | sudo -S ifconfig eth0 up
    sleep 5


# Bringing eth0 down and up reduces the probability of "Ping fail".
# In the ping test below, about 1 run out of 150 still fails ("Ping fail").
# So bringing eth0 down and up does not solve the problem.

# Please check the reason.

# Thanks.


sleep 5
echo "##############network restart##############"
ping -c 8 -w 100 192.128.60.200
if [[ $? != 0 ]]; then
    echo "Ping2 fail!!!!!!!!!!!!!!!!!!! "
    break
else
    echo "Ping2 ok!!!!!!!!!!!!!!!!!!!!! "
    shutdown -r now
fi

done

Please share how many test cycles (or how long) it takes to reproduce the failure.

Each cycle takes about 30 s. The error happens after about 2 to 3 hours.

The shell commands and the description are both included in the script above.

hello qitb,

May I know which JetPack release you are working with?
We tested 100 reboot cycles and could not reproduce the issue on r32.6.1/Xavier-NX.

Hi,
JetPack 4.4.
This is an intermittent problem, so it may not appear within 100 reboot cycles.
Thanks.

hello qitb,

Please gather logs when the issue happens: $ dmesg > klog.txt

Hi,
We captured the messages from the debug UART port.
The file is attached. It contains the output of many reboots, so please look at the last one.
Thanks.
ReceivedTofile-COM3-2021_10_8_19-20-18.DAT (9.8 MB)

The last reboot is the one with the network error.

You didn’t remove “quiet” from your /boot/extlinux/extlinux.conf.

So your log doesn’t actually contain any useful kernel messages. By default, kernel messages are suppressed.
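For reference, a minimal sketch of removing “quiet” from the kernel command line. It assumes the parameter sits on the APPEND line of extlinux.conf; the helper name remove_quiet is hypothetical, and the file should be checked by hand afterwards.

```bash
#!/bin/bash
# Sketch: strip the "quiet" kernel parameter from the APPEND line(s) of an
# extlinux.conf so that kernel messages show up in the boot log.
# remove_quiet is a hypothetical helper name, not an NVIDIA tool.
remove_quiet() {
    local conf="$1"
    # Keep a backup next to the original before editing in place.
    cp "$conf" "$conf.bak" || return 1
    # Only touch APPEND lines; drop the standalone word "quiet".
    # Uses GNU sed options (-i, -E), as shipped with Ubuntu.
    sed -i -E '/^[[:space:]]*APPEND/ s/[[:space:]]+quiet([[:space:]]|$)/\1/g' "$conf"
}
```

On the device this would be run as root, for example remove_quiet /boot/extlinux/extlinux.conf, followed by a reboot so the new command line takes effect.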

Hi, we removed “quiet” and captured the logs: one from a good run and one from a failing run. Please check the difference and help solve the issue.
dmesg_11_fail_10152106.log (57.4 KB)
dmesg_11_ok.log (62.7 KB)
Thanks.

The dmesg shows that eth0 is up even in your failure case. No other error is seen.

Please use tcpdump to capture packets when the error happens, and inspect the capture with a tool like Wireshark. Also, share the statistics reported by ethtool.
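A rough sketch of how this data could be collected when ping fails. The helper names run and collect_eth_diag, the 30-second capture window, and the output file name are assumptions for illustration; a ping should be running in another terminal so that the capture contains traffic.

```bash
#!/bin/bash
# Sketch: collect ethtool statistics and a packet capture for one interface.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

collect_eth_diag() {
    local iface="${1:-eth0}" out="${2:-eth_diag}"
    # NIC/driver counters (errors, drops) as reported by the driver.
    run sudo ethtool -S "$iface"
    # Negotiated speed/duplex and the "Link detected" state.
    run sudo ethtool "$iface"
    # Capture for 30 seconds into a pcap file that Wireshark can open.
    run sudo timeout 30 tcpdump -i "$iface" -w "$out.pcap"
}
```

For example, collect_eth_diag eth0 eth0_fail > eth0_fail_ethtool.txt saves the ethtool output to a text file and the capture to eth0_fail.pcap.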

Hi, at the end of the log files:

1. The failing log shows:
[ 26.083364] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 26.083384] Bluetooth: BNEP socket layer initialized
[ 86.602084] gpio tegra-gpio wake20 for gpio=52(G:4)
[ 86.617544] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready *************************

2. The OK log shows:
[ 222.241922] gpio tegra-gpio wake20 for gpio=52(G:4)
[ 222.251940] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 223.963570] eqos 2490000.ether_qos eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 223.964227] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready *******************

3. Please compare the last lines: in one log eth0 becomes ready, in the other the eth0 link is not ready.
Is this difference the cause?
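The comparison above can be automated. The following is only a sketch: it assumes the two message strings shown in the excerpts ("eth0: link is not ready" and "eth0: Link is Up") and reports which one appears last in a saved dmesg log. The helper name last_link_state is hypothetical.

```bash
#!/bin/bash
# Sketch: report whether the last eth0 link message in a saved dmesg log
# says the link came up or stayed "not ready".
last_link_state() {
    awk '
        BEGIN { up = 0; down = 0 }
        /eth0: link is not ready/ { down = NR }   # NETDEV_UP without carrier
        /eth0: Link is Up/        { up = NR }     # driver reports link up
        END {
            if (up > down)      print "link up"
            else if (down > 0)  print "link not ready"
            else                print "no eth0 link messages"
        }
    ' "$1"
}
```

For example, last_link_state dmesg_11_fail_10152106.log can be run on each saved log to sort the runs into good and failing ones.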

Oh, sorry, I missed that line.

Do you have the syslog, so that we can check why it took 80 seconds before this interface went down?

Hi, at the end of the log files:

1. The failing log shows:
[ 26.083364] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 26.083384] Bluetooth: BNEP socket layer initialized
[ 86.602084] gpio tegra-gpio wake20 for gpio=52(G:4)
[ 86.617544] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready *************************

2. The OK log shows:
[ 52.897468] sd 0:0:0:0: [sda] Write Protect is off
[ 52.897687] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[ 52.898164] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn’t support DPO or FUA
[ 52.909159] sda: sda1
[ 52.911171] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 222.241922] gpio tegra-gpio wake20 for gpio=52(G:4)
[ 222.251940] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 223.963570] eqos 2490000.ether_qos eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 223.964227] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready *******************

3. In the OK log the timestamps jump from 52 s to 222 s; in the error log they jump from 26 s to 86 s.

4. The logs were captured with the script below:
#!/bin/bash
while true
do
    echo "###############TEST#################"
    date
    sleep 50
    # Take all interfaces down and bring them back up.
    echo 'nvidia' | sudo -S ifdown -a
    echo 'nvidia' | sudo -S ifup -a
    echo "network ifdown&up"
    ping -c 3 -w 10 192.168.60.200 > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        echo "network ok"
        shutdown -r now
    else
        echo "network fail"
        # Save the kernel log of the failing run.
        dmesg > dmesg.log
    fi
done

Thanks.

Hi,
Please ignore the timestamp at the beginning of each line; it may not be related to the error.

After bringing eth0 down and up, the OK log and the error log differ as follows:

The error log has only these two lines:
[ 222.241922] gpio tegra-gpio wake20 for gpio=52(G:4)
[ 222.251940] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

The OK log has two additional lines:
[ 223.963570] eqos 2490000.ether_qos eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 223.964227] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready *******************

Hi,

Can you describe this issue more clearly?

Did you use any command to trigger this error?

I am not sure what you mean by your comment about the OK and error logs after bringing eth0 down and up. Do you always need to run ifconfig eth0 down and up to see this error?

Also, did you check the syslog? Do you know where to find it? So far it looks like you are only checking the kernel log.
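As a pointer for the syslog check, here is a small sketch. It assumes the Ubuntu-based root filesystem used by JetPack, where the syslog is normally written to /var/log/syslog; the helper name eth0_syslog and the search terms are illustrative guesses, not a fixed list.

```bash
#!/bin/bash
# Sketch: pull network-related lines out of the syslog.
# The default path assumes an Ubuntu-style rsyslog setup.
eth0_syslog() {
    local log="${1:-/var/log/syslog}"
    # Interface name plus the daemons that commonly manage it.
    grep -E 'eth0|NetworkManager|dhclient' "$log"
}
```

For example, eth0_syslog | tail -n 50 shows the most recent entries right after a failure; reading the file may require sudo, depending on its permissions.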

When the network is unreachable, I bring eth0 down and up.
The OK log is from a run in which the network is reachable again after the down/up.
The error log is from a run in which the network does not recover after the down/up.

Thanks.
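To tell these two cases apart without reading dmesg by hand, one option is to poll the carrier flag that the kernel exposes in sysfs after bringing the interface up. This is only a sketch: the helper name wait_for_carrier and the 10-second default are assumptions, and the third argument exists only so that the function can be pointed at a different directory for testing.

```bash
#!/bin/bash
# Sketch: wait until the interface reports carrier (physical link) or give up.
# Returns 0 if carrier is seen within the timeout (seconds), 1 otherwise.
wait_for_carrier() {
    local iface="${1:-eth0}" timeout="${2:-10}" sysnet="${3:-/sys/class/net}"
    local i=0
    while [ "$i" -lt "$timeout" ]; do
        # carrier reads 1 once the link is up; the read fails while the
        # interface is administratively down, which is treated as "no link".
        if [ "$(cat "$sysnet/$iface/carrier" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

For example, after sudo ifconfig eth0 up, running wait_for_carrier eth0 10 || dmesg > dmesg_no_link.log saves the kernel log only for the runs in which the link never comes up.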