USB devices are disconnected for no reason

We use an RTL8153 as the Ethernet PHY, connected to the TK1 over USB. During a long-term automated test, the RTL8153 was found to disconnect and reconnect for no apparent reason, which causes the IP address to be lost.

The kernel log shows the following:

[10308.357196] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[10323.939137] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[10328.939207] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[10458.576201] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[10863.151205] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[85823.153362] tegra-xhci tegra-xhci: Firmware timestamp: 2014-09-16 02:10:07 UTC, Falcon state 0x20
[85823.280701] usb 2-1: USB disconnect, device number 2
[85823.533869] usb 2-1: new SuperSpeed USB device number 3 using tegra-xhci
[85823.544842] usb 2-1: Parent hub missing LPM exit latency info. Power management will be impacted.
[85823.545733] usb 2-1: New USB device found, idVendor=0bda, idProduct=8153
[85823.545739] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[85823.545743] usb 2-1: Product: USB 10/100/1000 LAN
[85823.545747] usb 2-1: Manufacturer: Realtek
[85823.545751] usb 2-1: SerialNumber: 000001
[85823.650135] usb 2-1: reset SuperSpeed USB device number 3 using tegra-xhci
[85823.660841] usb 2-1: Parent hub missing LPM exit latency info. Power management will be impacted.
[85823.705391] r8152 2-1:1.0 eth0: v2.09.RC2 (2017/01/26)
[85823.705398] r8152 2-1:1.0 eth0: This product is covered by one or more of the following patents:
[85823.705398] US6,570,884, US6,115,776, and US6,327,625.
[85823.705398]
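
For anyone who wants to pull the disconnect times out of a long log like this one, here is a minimal Python sketch. It assumes the `dmesg` line format shown above; the function name `find_disconnects` and the sample string are only for illustration.

```python
import re

# Matches lines such as:
#   [85823.280701] usb 2-1: USB disconnect, device number 2
DISCONNECT_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+usb (\S+): USB disconnect, device number (\d+)"
)

def find_disconnects(log_text):
    """Return (uptime_seconds, port, device_number) for each USB disconnect line."""
    events = []
    for line in log_text.splitlines():
        match = DISCONNECT_RE.match(line.strip())
        if match:
            uptime = float(match.group(1))
            events.append((uptime, match.group(2), int(match.group(3))))
    return events

# Two lines copied from the log above, used as sample input.
sample = (
    "[85823.153362] tegra-xhci tegra-xhci: Firmware timestamp: "
    "2014-09-16 02:10:07 UTC, Falcon state 0x20\n"
    "[85823.280701] usb 2-1: USB disconnect, device number 2\n"
)

for uptime, port, devnum in find_disconnects(sample):
    # Kernel timestamps are seconds since boot, so convert to hours.
    print(f"usb {port} (device {devnum}) disconnected {uptime / 3600:.1f} h after boot")
```

Applied to the log above, this puts the disconnect at roughly 23.8 hours of uptime. Collecting these times over several runs may help show whether the dropouts line up with load or with a fixed interval.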

Hi yhao,
Do you run any application while the Ethernet is connected, or is the system simply left idle? How long does it take before the disconnection happens?

I am also wondering if you can try with an externally powered HUB…I notice the power management note in the log (“Parent hub missing LPM exit latency info”), and there is some minor possibility that power consumption changes under load (and I believe the ethernet is wired internally through USB, so you’d want to know if the behavior goes away when power consumption by the external devices is no longer part of the environment).

Thanks, @linuxdev, for the suggestion. It is worth trying an externally powered HUB.

Thanks for your suggestions.

First, yes, there is an application running on the board. In its code, we don’t find any command that disconnects USB devices. Is there any interface through which we can check whether the disconnection was caused by a user space command?

@linuxdev, the RTL8153 is wired internally. Do you mean we should find a USB Ethernet dongle, use it as the Ethernet interface in place of the RTL8153, and then run the test?

I was thinking of any external USB devices with possibly significant power consumption, not anything specific influencing the dropout…keyboard and mouse would not matter, but many other devices could, e.g., external USB drives. So it sounds like the RTL8153 PHY you are talking about on USB is custom and hard wired without going through an external port…is the board itself custom?

One thing to consider is that if devices consume more power over USB while under load (or if any device using that same power bus increases power use under heavy load) it doesn’t really matter if it is integrated or not…the idea is that whatever bus powers this may be dropping out. It would just be convenient if an external HUB could replace one power source with a known more powerful source. If you have no possibility of detaching an external USB device which consumes significant power to reduce consumption of the power bus, then it might be worth looking at the bus powering the device before and after dropout (you’d need to watch the exact voltage delivered during the process of going from functioning to failing…it could be nothing but a momentary spike). Power delivery more or less needs to be confirmed before you can look at marginal signals or software issues since you don’t have an OOPS or smoking gun log message to look at.

If you have not tried it yet, you may also disable USB autosuspend, preferably at boot time by adding usbcore.autosuspend=-1 to the kernel command line.
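
To confirm what the running system is actually using, the standard sysfs entries can be read back. The following is only a sketch: it assumes the usual Linux paths `/sys/module/usbcore/parameters/autosuspend` and `/sys/bus/usb/devices/*/power/control`, and the helper name `usb_autosuspend_report` is made up for this example.

```python
from pathlib import Path

def usb_autosuspend_report(sysfs_root="/sys"):
    """Collect the usbcore autosuspend default and each USB device's power/control value."""
    root = Path(sysfs_root)

    # Global default delay in seconds; -1 means autosuspend is disabled by default.
    param = root / "module" / "usbcore" / "parameters" / "autosuspend"
    default = param.read_text().strip() if param.exists() else None

    # Per-device setting: "on" keeps the device awake, "auto" allows runtime suspend.
    devices = {}
    for control in sorted(root.glob("bus/usb/devices/*/power/control")):
        devices[control.parent.parent.name] = control.read_text().strip()

    return {"usbcore.autosuspend": default, "devices": devices}

if __name__ == "__main__":
    report = usb_autosuspend_report()
    print("usbcore.autosuspend =", report["usbcore.autosuspend"])
    for name, mode in report["devices"].items():
        print(f"{name}: power/control = {mode}")
```

If a device such as 2-1 still reports `auto` after the boot parameter is set, something in user space (a udev rule or a power-saving tool, for instance) may be re-enabling runtime suspend for it.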