A device at a customer site is restarting repeatedly. The symptom is that its IP address responds to ping for a while and then stops responding for a while. Watching the device's indicator lights confirms that the system is rebooting over and over. We checked the network cable and its port and found no problem. The system logs copied from the device are attached; please help analyze the cause of the restarts. Thank you!
syslog.gz (3.6 MB)
kern.log.bz2.gz (7.4 MB)
Use the serial console to check. Syslog or kern.log won’t show the fatal cause.
The affected device is already installed at the customer site, so there is no way to attach a serial console for debugging. We also can’t reproduce the problem in the office, which makes it harder to investigate. Is there any other way we can determine the cause of this problem?
What is connected to the Jetson at the customer site? Is it powered exactly the same way? Do you have the same peripherals? Little things like using a USB HUB might matter (or if one has an externally powered HUB and the other powers from the line). You are running from an NVMe, and so it would be important to know if (A) you are using the same NVMe, and (B) if there is any difference in how this NVMe is powered.
Also, the log has an endlessly repeating error for USB Video Class:
uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
There is also a USB mouse issue right before the reboot in the log. I won’t show the whole thing since it is perhaps binary data, but it starts like this:
Oct 8 11:11:22 localhost kernel: [ 206.548982] hid-generic 0003:03F0:1F4A.0007: input,hidraw0: USB HID v1.11 Mouse [PixArt HP USB Optical Mouse] on usb-3610000.xhci-2.1.3/input0
^@^@^@^@^@^@^@
(it is a PixArt mouse; this might not be an error if raw data is being logged, but this is out of place)
This does not mean it is related to rebooting, but it is possible that the reboot is USB related. Because (as @WayneWWW mentioned) this log does not include boot stages, and because the reboot is not happening within Linux, the issue is very likely in the boot stages. Serial console covers boot stages, Linux logs do not (Linux is not even running in boot stages). Anything in the logs you posted might be useful, but it isn’t really sufficient. It leaves us guessing.
Incidentally, compressing with both .bz2 and .gz will normally make the file larger. Max compression is normally with “bzip2 -9 <original file>”.
The onsite devices are powered independently, and only one power port and one network port are exposed on the device cover, so there is no way to use the serial port for debugging. In addition, the repeated restart does not happen every time, and I cannot reproduce it in the office.
If you can get other logs from this system, and if those logs always repeatedly fail at this spot (in the kern.log), versus if the log ends somewhere else randomly, then the issue is one step closer to being solved. I’m interested in knowing if the reboot point is always the same point when at the customer site, versus not.
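If several logs do come back from the site, something like the following Python sketch could pull out the last kernel message before each boot so they can be compared. It assumes the standard syslog prefix seen in the posted excerpts and uses the "Booting Linux on physical CPU" line as the start-of-boot marker; the function name `last_messages_before_boot` is made up for this example.

```python
import re

BOOT_MARKER = "Booting Linux on physical CPU"
# Matches a prefix such as "Oct  8 11:11:22 localhost kernel: [  206.548982] ".
STAMP = re.compile(
    r"\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s*"
)


def last_messages_before_boot(log_text: str) -> list:
    """Return the last kernel message logged before each boot marker."""
    results = []
    previous = None
    for line in log_text.splitlines():
        line = line.replace("\x00", "").strip()  # drop NUL padding from an unclean stop
        if not line:
            continue
        if BOOT_MARKER in line:
            # A truncated final message may share a line with the boot marker
            # when the previous boot never wrote its newline.
            stamps = list(STAMP.finditer(line))
            if stamps and stamps[-1].start() > 0:
                head = line[: stamps[-1].start()].strip()
                if head:
                    previous = head
            if previous is not None:
                # Remove the timestamp so identical messages compare equal.
                results.append(STAMP.sub("", previous, count=1))
            previous = None
        else:
            previous = line
    return results
```

Feeding the result into `collections.Counter` would show at a glance whether the log always stops on the same message or on a different one each time.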
How difficult is it to get the customer to try things like booting without the mouse? I realize the unit may not be there anymore, and perhaps it is difficult to ask for more logs and such, but if possible, I’ll show you why this seems to be important.
In the kern.log you will see this repeated over and over:
Oct 8 11:10:32 localhost kernel: [ 154.975823] input: PixArt HP USB Optical Mouse as /devices/3610000.xhci/usb1/1-2/1-2.1/1-2.1.3/1-2.1.3:1.0/0003:03F0:1F4A.0006/input/input10
Oct 8 11:10:32 localhost kernel: [ 154.979429] hid-generic 0003:03F0:1F4A.0006: input,hidraw0: USB HID v1.11 Mouse [PixArt HP USB Optical Mouse] on usb-3610000.xhci-2.1.3/input0
Oct 8 11:11:20 localhost kernel: [ 203.342783] usb 1-2.1.3: USB disconnect, device number 9
Oct 8 11:11:22 localhost kernel: [ 204.867755] usb 1-2.1.3: new low-speed USB device number 10 using tegra-xusb
Oct 8 11:11:22 localhost kernel: [ 204.891873] usb 1-2.1.3: New USB device found, idVendor=03f0, idProduct=1f4a
Oct 8 11:11:22 localhost kernel: [ 204.891885] usb 1-2.1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Oct 8 11:11:22 localhost kernel: [ 204.891918] usb 1-2.1.3: Product: HP USB Optical Mouse
Oct 8 11:11:22 localhost kernel: [ 204.891925] usb 1-2.1.3: Manufacturer: PixArt
This goes on many times. At some point other logs appear (different processes are running, and so I would expect that even if this log is repeating there might be other drivers occasionally logging). Then this occurs and is the final timestamp before reboot logs start:
Oct 8 11:11:22 localhost kernel: [ 206.548982] hid-generic 0003:03F0:1F4A.0007: input,hidraw0: USB HID v1.11 Mouse [PixArt HP USB Optical Mouse] on usb-3610000.xhci-2.1.3/input0
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@Oct 8 11:08:01 localhost kernel: [ 0.000000] Booting Linux on physical CPU 0x0
The end of that last block of log shows “Booting Linux…”, which is the start of boot. The final log line of the original boot does not even put a newline in the log; printing of that line ceased before the newline. It could be coincidence that the mouse driver was in the middle of saying something with binary gibberish, and that something else caused the reboot whereby the mouse was just unlucky in its timing. Considering how many times USB repeatedly disconnected and reconnected that exact mouse, though, the odds are high that this is USB related.
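One rough way to put a number on that disconnect/reconnect churn is to count the events per USB port path. The Python sketch below assumes the message wording shown in the excerpts above ("USB disconnect, device number N" and "new low-speed USB device number N using tegra-xusb"); the function name `count_usb_cycles` is made up for this example.

```python
import re
from collections import Counter

# Message formats taken from the kern.log excerpts in this thread.
DISCONNECT = re.compile(r"usb (\S+): USB disconnect, device number \d+")
NEW_DEVICE = re.compile(r"usb (\S+): new \S+ USB device number \d+ using \S+")


def count_usb_cycles(log_text: str) -> dict:
    """Count disconnect and enumeration events per USB port path (e.g. '1-2.1.3')."""
    disconnects = Counter(DISCONNECT.findall(log_text))
    enumerations = Counter(NEW_DEVICE.findall(log_text))
    ports = sorted(set(disconnects) | set(enumerations))
    return {
        port: {
            "disconnects": disconnects[port],
            "enumerations": enumerations[port],
        }
        for port in ports
    }
```

A port whose counts are far higher than any other port in the same log would be the first thing to unplug or swap.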
If a different USB HUB is used, or different cables, or if different USB peripherals are used, then any defect in that chain could be of interest. If the customer did not send the exact cables and USB peripherals back to you, then the lack of failure becomes more interesting at your end. Just having a longer cable at the customer end could be related if there is a signal quality issue.
It would be very useful to try booting with no USB connected. If the log changes, or if reboot stops happening, then the issue is almost certainly a USB peripheral or driver. If you merely have a few different logs, and if the binary gibberish occurs somewhere other than USB, then that too is a good clue. If the reboot always occurs within that PixArt mouse, then that is nearly “smoking gun” evidence. A lot depends on being able to reproduce this.
Related to all of this, you get these errors, which are USB (but in syslog):
Oct 8 11:13:07 localhost kernel: [ 320.102865] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 8 11:13:08 localhost kernel: [ 321.355425] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 8 11:13:09 localhost kernel: [ 322.608006] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
UVC is the USB Video Class standard, and uvcvideo is the Linux driver for it. Your camera also has issues on USB. Do note that if there is an issue with any USB device sharing a root HUB with another device, then some errors will break more than one USB device. It is quite possible the camera is breaking the mouse, or vice versa. You need to verify whether the exact same peripherals are used.
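To gauge how widespread the camera errors are, the failures can be grouped by request, control, unit and error code. The Python sketch below assumes the uvcvideo message format shown above; the function name `summarize_uvc_failures` is made up for this example. On Linux, -32 is -EPIPE, which USB drivers commonly report when a device stalls a request.

```python
import re
from collections import Counter

# Format taken from the syslog excerpt, e.g.
# "uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024)."
UVC_FAIL = re.compile(
    r"uvcvideo: Failed to query \((\w+)\) UVC control (\d+) on unit (\d+): (-?\d+)"
)


def summarize_uvc_failures(log_text: str) -> Counter:
    """Count failures keyed by (request, control, unit, error code)."""
    return Counter(
        (request, int(control), int(unit), int(code))
        for request, control, unit, code in UVC_FAIL.findall(log_text)
    )
```

If every failure lands on the same control and unit, that points at one specific camera control being polled rather than a general bus failure, though it would not rule out a shared-HUB problem.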
One final thought: Many USB issues can occur if the device tree is not correct. I don’t see anything specifically saying the device tree is wrong, but if you have a custom device tree, then you might make sure it is loaded.