Reboot and eth0 problem:

Hello,
I work on a Jetson Xavier NX with JetPack 5.0.2 and DeepStream 6.1.
I have hit a reboot issue many times: after running the reboot command, I sometimes cannot get back into the system on the box because eth0 is lost. The same thing sometimes happens when the router restarts.
I then have to unplug the power cable and try again.
Could you give some advice on this problem, or do you have a patch for it? I have found many people reporting a similar issue in the forums… I am sure it is an Ethernet problem, and I use a static IP.

thank you very much!

I have also set Ubuntu 20.04 to never sleep.

thank you

A few things to check:

  1. What is the “ethtool -S eth0” output when the error happens? We want to check the error counter values.
  2. Please try disabling the EEE function of the Ethernet interface with “ethtool --set-eee eth0 eee off” and see whether it helps.
  3. Is eth0 still able to ping other devices when the error happens?
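
For item 1, a minimal Python sketch of how the counters could be screened once the output is available. It assumes the usual "name: value" layout that “ethtool -S” prints; the function names and the keyword list used to spot error-like counters are illustrative assumptions, not anything defined by the driver.

```python
import subprocess

# Substrings that usually mark an error-type counter (an assumed heuristic).
ERROR_HINTS = ("err", "drop", "crc", "fifo", "collision")


def nonzero_error_counters(ethtool_output):
    """Return {counter_name: value} for error-like counters that are not zero."""
    found = {}
    for line in ethtool_output.splitlines():
        name, _, value = line.rpartition(":")
        name, value = name.strip(), value.strip()
        if not name or not value.isdigit():
            continue  # skips the "NIC statistics:" header and blank lines
        if int(value) and any(hint in name.lower() for hint in ERROR_HINTS):
            found[name] = int(value)
    return found


def read_counters(iface="eth0"):
    """Run 'ethtool -S <iface>' and return its text (needs ethtool installed)."""
    result = subprocess.run(["ethtool", "-S", iface],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

On the board this could be called as nonzero_error_counters(read_counters("eth0")) and the result saved before each reboot, so that something is on record even when the system cannot be reached afterwards.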

Hello,
For 1 and 3, I cannot do anything when it happens, because I cannot get into the system at all after the reboot.
For 2, I have already set that as you kindly advised. I will test the problem more over the weekend and let you know the results.

thank you very much!

Get a serial console boot log. If there is a “quiet” in your extlinux.conf, then remove that (it sounds like you can boot correctly at least part of the time; get the log for a failure case).
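
A minimal Python sketch of the “quiet” removal, for anyone who prefers to script it. It assumes the file is /boot/extlinux/extlinux.conf and that kernel arguments sit on lines starting with APPEND; the function name drop_quiet is hypothetical. It only transforms text, so writing the result back (as root, after keeping a backup copy) is left to the user.

```python
def drop_quiet(conf_text):
    """Return conf_text with the 'quiet' token removed from APPEND lines."""
    result = []
    for line in conf_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        tokens = body.split()
        # Commented-out lines start with '#', so they are left untouched.
        if tokens and tokens[0].upper() == "APPEND" and "quiet" in tokens:
            indent = body[: len(body) - len(body.lstrip())]
            # Rejoining with single spaces also collapses repeated spaces.
            body = indent + " ".join(t for t in tokens if t != "quiet")
        result.append(body + ending)
    return "".join(result)
```

Only a standalone quiet argument is dropped; every other token on the APPEND line is kept in its original order.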

Please dump the log as linuxdev suggested here. We cannot tell what is going wrong from just the behavior/symptoms in your comment.

Also, is this an NV devkit or a custom board?

OK, I will try to get a log for you to analyze. This problem has bothered me for a long time.
I ran ethtool --set-eee as you advised and tried the reboot command again. It still happens, roughly once every 4-5 reboots.
The exact issue is:
1. Run reboot or sudo reboot.
2. When the error happens, the screen is black. Sometimes there is no green light and sometimes the green light is on; it seems the box has powered down or gone into standby.
3. I then have to power off and power on again to restart. Sometimes I have to unplug the power cable from the box and plug it in again before it restarts.
4. I have 3 NX boxes on hand for development now, and all 3 show the same issue.

Anyway, I will capture logs for you. I need to get a serial port tool first; could you point me to a document on how to get a log over serial?

thank you very much!

These are the dmesg logs from a failed reboot and the restart after it. I am not sure whether they are useful to you.

thank you very much!
dmesg (97.3 KB)
dmesg.0 (95.5 KB)

Mine are NX devkits, not custom boards.
I got them from JD and have used them for software work for more than a year. They are now being tested in a project that controls IP cameras and some sensors.

Hi,

Has this issue ever happened on your board when you use the NV devkit + JP4.x?

I use the NV devkit + JetPack 5.0.2.

It never happened when I previously used JetPack 4.6.1 with Ubuntu 18.04.

Hi,

Sorry, I am not sure about your error. Your dmesg does not show an error log either.

What is the exact behavior? You can also describe it in Chinese if that would be clearer.

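Hi,
All three devkits were bought from JD. I develop software on them and have now started engineering tests. Two boxes are deployed outdoors and need frequent updates and debugging, so I often have to reboot them remotely. My problem is:
1. Run reboot or sudo reboot.
2. Much of the time the system then fails to start and the screen is black. I have to cut the power and power on again to restart. Often even that does not work, and I have to unplug the power cable from the box's power jack before it restarts successfully.
3. Testing on a local dev box, the reboot fails about once every 4-5 attempts. I then power it on again, and more often than not I have to unplug and replug the box's power cable.
4. I used to think it was a network port problem, but it now looks like it may also be related to power management: after a failed reboot, the system shuts itself down or enters standby. After the problem occurs, the indicator light on the board is sometimes on and sometimes off. I attached a USB fan to the outdoor box; after a failed reboot command the fan is sometimes spinning and sometimes stopped, as if the board is not powered or the system is in standby.
5. I had always used JetPack 4.6.1 with Ubuntu 18.04 before and never hit this problem. It only appeared after updating to JetPack 5.0.2 and Ubuntu 20.04.

Thanks!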

Hi,

Xavier NX has some known boot issues in JP5.0.2 that have not been fixed yet. Is it possible to stay on JP4.6.1 for now?

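Hi,
Changing back to 4.6.1 would be a huge effort for me. It involves the algorithm models, CUDA, and cuBLAS, and above all I do not have the time... I will keep using this version for engineering tests for now.
I just unplugged the network cable and tested the local dev box again. After a dozen or so reboots, the black-screen problem still occurs. At that point the board's indicator light is on, but pressing the button on the power board does not power up the box, and after cutting power and powering on again the board's indicator light stays off. The only option is to unplug the power cable from the power jack on the NV board; after that it can be powered on and restarted.
The test above is for your reference. This is a fatal system bug in 5.0.2.
Please do not close this topic, and please let me know if there are any updates...

Thanks!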

Hi,
I would like to ask again about the reboot problem above on the Jetson NX: has it been solved yet?
I have stayed on JetPack 4.6.1 until now, but I do want to update to DeepStream 6.2. Is that OK?

thank you very much!

Hi,

Actually, we did not know what exact issue you were asking about before. But some issues are indeed fixed in rel-35.2, so you can try upgrading to it.

The earlier problem is that the Jetson NX dies after a few consecutive reboots.

Hi,

Well… I think there is a point of logic worth clarifying here. A board can reboot unexpectedly for many reasons, and a panic in any driver can lead to a reboot.

I went back and checked the log files you provided. They are not related to any of the NX known issues that we currently know of and have already fixed. So when you keep asking whether this issue has been fixed, honestly all I can say is that we do not know and cannot guarantee it, because we do not know the panic cause that leads to the reboot.

If you want to confirm the problem, please capture the log with the UART serial console instead of using dmesg, because dmesg may not have captured the root cause of the reboot panic.
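
As a rough illustration of capturing such a log, here is a minimal Python sketch that timestamps each console line so a failed reboot can be located afterwards. It assumes the console is already available as a binary stream, for example a USB serial adapter that shows up as a device such as /dev/ttyUSB0 (a hypothetical name) and has been set to 115200 8N1 by another tool. The function name log_console is also an assumption. A terminal program with logging enabled does the same job.

```python
import time


def log_console(stream, out, now=lambda: time.strftime("%Y-%m-%d %H:%M:%S")):
    """Copy lines from a binary stream to a text sink, prefixing a timestamp.

    Stops when the stream returns no more data. Returns the number of lines.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        # Console output can contain stray bytes, so never fail on decoding.
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        out.write(f"[{now()}] {text}\n")
        out.flush()  # keep the file current in case the host side is interrupted
        count += 1
    return count
```

A possible call, under the same assumptions, is log_console(open("/dev/ttyUSB0", "rb", buffering=0), open("boot.log", "a")), left running on the host PC across several reboots of the board.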