Networking issue with TX1+Orbitty

Hello,

We are using Orbitty boards together with TX1 modules. Several times a day, networking fails on some boards.
The link and the interface (eth0) go down and up again, which currently makes the boards unusable in production.
The cause of the USB resets on usb 2-2 is also unclear. We have a USB3 streaming device on 2-2…

[709946.682906] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[709951.382911] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710192.826664] r8152 2-1:1.0 eth0: carrier off
[710196.403137] r8152 2-1:1.0 eth0: carrier on
[710197.852368] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710198.277731] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra
[710199.323746] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710199.546349] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710199.569051] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710203.130564] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710206.349888] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra
[710207.395931] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710207.625157] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710207.644386] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710215.224290] r8152 2-1:1.0 eth0: carrier off
[710218.899884] r8152 2-1:1.0 eth0: carrier on
[710224.588945] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710226.378327] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra
[710227.422820] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710227.644532] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710227.663596] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710227.983833] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710302.199902] r8152 2-1:1.0 eth0: carrier off
[710306.036460] r8152 2-1:1.0 eth0: carrier on
[710307.954794] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710308.412095] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra
[710309.458969] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710309.499145] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[710309.685637] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710309.705547] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[710507.594800] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[711570.761692] r8152 2-1:1.0 eth0: carrier off
[711574.402485] r8152 2-1:1.0 eth0: carrier on
[715995.916882] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra
[715996.960027] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[715997.182024] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
[715997.200772] xhci-tegra 70090000.xusb: WARN Event TRB for slot 2 ep 2 with no TDs queued?
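
To check whether the eth0 carrier drops and the usb 2-2 resets in a log like the one above really occur together, the timestamps can be paired mechanically. The following is only a rough sketch, not an established tool: the function names, the 15 second window, and the idea of matching each "carrier off" to the next reset are assumptions made for illustration.

```python
import re

# Matches dmesg lines such as "[710192.826664] r8152 2-1:1.0 eth0: carrier off"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_events(lines):
    """Yield (timestamp, kind) for eth0 carrier-off and usb 2-2 reset lines."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts, msg = float(match.group(1)), match.group(2)
        if "eth0: carrier off" in msg:
            yield ts, "carrier_off"
        elif "usb 2-2: reset SuperSpeed" in msg:
            yield ts, "usb_reset"


def resets_after_carrier_off(lines, window=15.0):
    """Pair each carrier-off with the first usb 2-2 reset within `window` seconds.

    Returns a list of (carrier_off_timestamp, delay_in_seconds_or_None).
    The window length is an arbitrary assumption, not a documented value.
    """
    events = list(parse_events(lines))
    result = []
    for i, (ts, kind) in enumerate(events):
        if kind != "carrier_off":
            continue
        delay = None
        for later_ts, later_kind in events[i + 1:]:
            if later_ts - ts > window:
                break
            if later_kind == "usb_reset":
                delay = round(later_ts - ts, 3)
                break
        result.append((ts, delay))
    return result


if __name__ == "__main__":
    # A few lines copied from the log above; pass the full dmesg output instead.
    sample = [
        "[710192.826664] r8152 2-1:1.0 eth0: carrier off",
        "[710196.403137] r8152 2-1:1.0 eth0: carrier on",
        "[710198.277731] usb 2-2: reset SuperSpeed USB device number 3 using xhci-tegra",
    ]
    for ts, delay in resets_after_carrier_off(sample):
        print(ts, delay)
```

Applied to the full excerpt above, the first three carrier-off events are each followed by a usb 2-2 reset within the window (after roughly 5.5 s, 11.2 s and 6.2 s), while the fourth has none, and some resets occur without a preceding carrier drop. So the two symptoms often coincide but not always.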

Sometimes "Tx status -2" is logged a few seconds before carrier off:

2019-03-06T08:25:04.881858+00:00 kernel: r8152 2-1:1.0 eth0: Tx status -2
2019-03-06T08:25:04.888024+00:00 kernel: r8152 2-1:1.0 eth0: Tx status -2
2019-03-06T08:25:04.894368+00:00 kernel: r8152 2-1:1.0 eth0: Tx status -2
2019-03-06T08:25:07.111090+00:00 kernel: r8152 2-1:1.0 eth0: carrier off
2019-03-06T08:25:07.248577+00:00 kernel: r8152 2-1:1.0 eth0: carrier on

When USB autosuspend is enabled, the interface comes up at 100 Mbit/s after carrier on.
It then advertises only 10/100 Full, and ethtool is unable to change this.
Disabling autosuspend fixes only the autonegotiation issue, not the carrier off/on issue.
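
To confirm on each board whether autosuspend is actually active for the Ethernet adapter (usb 2-1 in the log), the runtime power setting can be read from sysfs. This is a minimal sketch under the assumption that the devices expose the standard `power/control` attribute, where `auto` allows autosuspend and `on` keeps the device powered; the function names are made up for illustration.

```python
from pathlib import Path


def usb_power_control(root="/sys/bus/usb/devices"):
    """Return {device_name: control_value} for USB devices exposing power/control."""
    states = {}
    for ctrl in sorted(Path(root).glob("*/power/control")):
        try:
            states[ctrl.parent.parent.name] = ctrl.read_text().strip()
        except OSError:
            # Attribute vanished or is unreadable (e.g. device unplugged); skip it.
            continue
    return states


def autosuspend_allowed(states, device):
    """True if the kernel is allowed to autosuspend `device` ("auto" setting)."""
    return states.get(device) == "auto"


if __name__ == "__main__":
    states = usb_power_control()
    for dev, value in states.items():
        print(f"{dev}: {value}")
    # "2-1" is the r8152 adapter in the log above; adjust for other boards.
    print("autosuspend allowed on 2-1:", autosuspend_allowed(states, "2-1"))
```

The sketch only reads the setting. Changing it means writing `on` to the same attribute as root, which is left out here on purpose.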

Hi maax,

Please put your module back on the devkit and run the test again.
Also, the TX1 Ethernet actually uses a USB interface, so these two errors may be related.

You can enable more log output with

echo 8 > /proc/sys/kernel/printk

Please share the error log again after the test on the devkit is done.
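
For reference, /proc/sys/kernel/printk holds four numbers, and the first one (the console log level) decides which kernel messages reach the console: a message is shown only if its level is numerically lower than that value, so 8 lets everything through, including debug messages. The following small sketch reads the current setting before a test run. The helper names are assumptions for illustration; the field names follow the kernel's sysctl documentation.

```python
from pathlib import Path

# Order of the four values in /proc/sys/kernel/printk
FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Parse the contents of /proc/sys/kernel/printk into a dict."""
    values = [int(v) for v in text.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return dict(zip(FIELDS, values))


def shown_on_console(message_level, settings):
    """A message reaches the console only if its level is below console_loglevel."""
    return message_level < settings["console_loglevel"]


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/printk")
    if path.exists():
        settings = parse_printk(path.read_text())
        print(settings)
        # Level 7 is KERN_DEBUG, the most verbose kernel message level.
        print("debug messages on console:", shown_on_console(7, settings))
```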