unstable USB issue on TX2

Hello

I’m having unstable USB connection issues on TX2, as described below.

1. Unstable card reader (priority)

I attached an SD card reader to USB SS1 and found that with a large-capacity card, such as a 128GB card or a micro SD card in a card adapter, mounting usually fails, and when it does succeed it takes a very long time.

[  870.878617] scsi 14:0:0:0: Direct-Access     Generic  MassStorageClass 1616 PQ: 0 ANSI: 6
[  871.313912] sd 14:0:0:0: [sda] Spinning up disk...
[  872.320961] ....................................................................................................not responding...
[  971.731017] sd 14:0:0:0: [sda] Read Capacity(10) failed: Result: hostbyte=0x00 driverbyte=0x08
[  971.739783] sd 14:0:0:0: [sda] Sense Key : 0x2 [current]
[  971.745391] sd 14:0:0:0: [sda] ASC=0x4 ASCQ=0x1
[  971.751519] sd 14:0:0:0: [sda] Test WP failed, assume Write Enabled
[  971.759279] sd 14:0:0:0: [sda] Asking for cache data failed
[  971.764931] sd 14:0:0:0: [sda] Assuming drive cache: write through
[  971.776878] sd 14:0:0:0: [sda] Spinning up disk...
[  971.784497] sd 14:0:0:0: [sda] Spinning up disk...
[  972.784954] .
[  972.792973] .....
[ 1071.184964] .
[ 1071.576957] .
[ 1072.188968] .not responding...
[ 1072.194975] sd 14:0:0:0: [sda] Read Capacity(10) failed: Result: hostbyte=0x00 driverbyte=0x08
[ 1072.203612] sd 14:0:0:0: [sda] Sense Key : 0x2 [current]
[ 1072.209035] sd 14:0:0:0: [sda] ASC=0x4 ASCQ=0x1
[ 1072.216415] sd 14:0:0:0: [sda] Attached SCSI removable disk
[ 1072.584961] .not responding...
[ 1072.591815] sd 14:0:0:0: [sda] Spinning up disk...
[ 1073.596967] ..................................................................................
[ 1155.645115] usb 2-2: USB disconnect, device number 14
[ 1155.665662] xhci-tegra 3530000.xhci: tegra_xhci_mbox_work mailbox command 6
[ 1155.932958] .ready
[ 1156.979617] xhci-tegra 3530000.xhci: tegra_xhci_mbox_work mailbox command 5
[ 1156.986658] xhci-tegra 3530000.xhci: tegra_xhci_mbox_work ignore firmware MBOX_CMD_DEC_SSPI_CLOCK request
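
For reference, the sense data repeated in the log above can be decoded with a short script. This is only a sketch: the lookup tables are a small hand-copied subset of the standard SCSI sense key and ASC/ASCQ assignments, and `decode_sense` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Small subset of the SCSI sense keys and ASC/ASCQ codes (assumption: only
# the entries relevant to this log plus a few neighbours are listed).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE",
    (0x04, 0x01): "LOGICAL UNIT IS IN PROCESS OF BECOMING READY",
    (0x04, 0x02): "LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED",
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
}

KEY_RE = re.compile(r"Sense Key : (0x[0-9a-fA-F]+)")
ASC_RE = re.compile(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)")


def decode_sense(log_text):
    """Return (sense key name, ASC/ASCQ description) pairs found in log_text."""
    results = []
    key = None
    for line in log_text.splitlines():
        m = KEY_RE.search(line)
        if m:
            key = int(m.group(1), 16)
            continue
        m = ASC_RE.search(line)
        if m and key is not None:
            asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
            results.append((
                SENSE_KEYS.get(key, "unknown"),
                ASC_ASCQ.get((asc, ascq), "unknown"),
            ))
            key = None
    return results


sample = """\
[  971.739783] sd 14:0:0:0: [sda] Sense Key : 0x2 [current]
[  971.745391] sd 14:0:0:0: [sda] ASC=0x4 ASCQ=0x1
"""
print(decode_sense(sample))
```

Sense key 0x2 with ASC=0x4 ASCQ=0x1 means the reader keeps reporting that the unit is not ready and still becoming ready, which matches the repeated "Spinning up disk..." lines.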

Besides, even when the card does mount, performance is very poor: for example, a “sync” system call blocks seemingly forever, or lots of I/O errors come up during SD communication.

When I switch the card reader to USB SS0, through a USB3.0 Type-A port, everything works well. So I believe the cause lies in the hardware circuit or the software configuration of SS1.

2. Unstable USB enumeration

I have 6 USB cameras connected to USB Bus 1 through 2 separate USB 2.0 hubs, and I use extra GPIOs to control their power. All the cameras are powered by an external circuit rather than by the hubs alone.
The problem is that the USB 2.0 hubs fail enumeration very easily, and every time that happens the whole enumeration of all cameras takes another 4~5 seconds (even when everything goes normally, the process still lasts about 6~7 seconds, which we would also like to optimize if possible). And sometimes some of the cameras fail to enumerate.

USB connection on USB bus:

Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-tegra/4p, 480M
    |__ Port 2: Dev 34, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 2: Dev 42, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M
        |__ Port 3: Dev 38, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M
        |__ Port 4: Dev 40, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M
    |__ Port 3: Dev 32, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 39, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M
        |__ Port 2: Dev 41, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M
        |__ Port 3: Dev 43, If 0, Class=Vendor Specific Class, Driver=usbfs, 480M

Kernel log when hub enumeration fails:

[ 1185.925917] usb 1-3.4: new high-speed USB device number 9 using xhci-tegra
[ 1185.934389] usb 1-3-port4: cannot reset (err = -71)
[ 1185.939365] usb 1-3-port4: cannot reset (err = -71)
[ 1185.944335] usb 1-3-port4: cannot reset (err = -71)
[ 1185.949301] usb 1-3-port4: cannot reset (err = -71)
[ 1185.954273] usb 1-3-port4: cannot reset (err = -71)
[ 1185.959174] usb 1-3-port4: Cannot enable. Maybe the USB cable is bad?
[ 1185.965685] usb 1-3-port4: cannot disable (err = -71)
[ 1186.081919] usb 1-2: new high-speed USB device number 10 using xhci-tegra
[ 1186.214256] usb 1-3-port4: cannot reset (err = -71)
[ 1186.219291] usb 1-3-port4: cannot reset (err = -71)
[ 1186.224213] usb 1-2: New USB device found, idVendor=0451, idProduct=8142
[ 1186.230927] usb 1-2: New USB device strings: Mfr=0, Product=0, SerialNumber=1
[ 1186.238068] usb 1-2: SerialNumber: A5030849DC80
[ 1186.242673] usb 1-3-port4: cannot reset (err = -71)
[ 1186.247649] usb 1-3-port4: cannot reset (err = -71)
[ 1186.252627] usb 1-3-port4: cannot reset (err = -71)
[ 1186.257534] usb 1-3-port4: Cannot enable. Maybe the USB cable is bad?
[ 1186.264190] usb 1-3-port4: cannot disable (err = -71)
[ 1186.269460] hub 1-2:1.0: USB hub found
[ 1186.273307] hub 1-2:1.0: 4 ports detected
[ 1186.277451] usb 1-3-port4: cannot reset (err = -71)
[ 1186.282561] usb 1-3-port4: cannot reset (err = -71)
[ 1186.287697] usb 1-3-port4: cannot reset (err = -71)
[ 1186.292701] usb 1-3-port4: cannot reset (err = -71)
[ 1186.297702] usb 1-3-port4: cannot reset (err = -71)
[ 1186.302729] usb 1-3-port4: Cannot enable. Maybe the USB cable is bad?
[ 1186.309253] usb 1-3-port4: cannot disable (err = -71)
[ 1186.314753] usb 1-3-port4: cannot reset (err = -71)
[ 1186.319737] usb 1-3-port4: cannot reset (err = -71)
[ 1186.324701] usb 1-3-port4: cannot reset (err = -71)
[ 1186.329663] usb 1-3-port4: cannot reset (err = -71)
[ 1186.334639] usb 1-3-port4: cannot reset (err = -71)
[ 1186.339542] usb 1-3-port4: Cannot enable. Maybe the USB cable is bad?
[ 1186.346043] usb 1-3-port4: cannot disable (err = -71)
[ 1186.351125] usb 1-3-port4: unable to enumerate USB device
[ 1186.356619] usb 1-3-port4: cannot disable (err = -71)
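
To see how long one failing burst lasts, the timestamps in a log like the one above can be reduced to a per-port window. This is a rough sketch, assuming the log is saved as plain `dmesg` text; `failure_window` is a hypothetical helper, and the sample is a four-line excerpt of the log above.

```python
import re

# Matches failure lines such as:
# [ 1185.934389] usb 1-3-port4: cannot reset (err = -71)
FAIL_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\] usb (\S+): "
    r"(cannot reset|Cannot enable|cannot disable|unable to enumerate)"
)


def failure_window(log_text):
    """Map each port to (first timestamp, last timestamp, failure line count)."""
    windows = {}
    for line in log_text.splitlines():
        m = FAIL_RE.match(line.strip())
        if not m:
            continue
        ts, port = float(m.group(1)), m.group(2)
        first, _, count = windows.get(port, (ts, ts, 0))
        windows[port] = (first, ts, count + 1)
    return windows


sample = """\
[ 1185.934389] usb 1-3-port4: cannot reset (err = -71)
[ 1185.959174] usb 1-3-port4: Cannot enable. Maybe the USB cable is bad?
[ 1186.351125] usb 1-3-port4: unable to enumerate USB device
[ 1186.356619] usb 1-3-port4: cannot disable (err = -71)
"""
for port, (first, last, count) in failure_window(sample).items():
    print(f"{port}: {count} failure lines over {last - first:.3f} s")
```

Running it over the full log, rather than the excerpt, shows which hub port accounts for the retries and how much of the enumeration time they take.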

Kernel log when camera enumeration fails:

[  467.989535] usb 1-3.3: new full-speed USB device number 10 using xhci-tegra
[  468.069703] usb 1-3.3: device descriptor read/64, error -32
[  468.249703] usb 1-3.3: device descriptor read/64, error -32
[  468.429526] usb 1-2.4: new full-speed USB device number 11 using xhci-tegra
[  468.509700] usb 1-2.4: device descriptor read/64, error -32
[  468.689701] usb 1-2.4: device descriptor read/64, error -32
[  468.750704] usb 1-3: USB disconnect, device number 8
[  469.113883] usb 1-2: USB disconnect, device number 9
[  469.229531] usb 1-2: new full-speed USB device number 16 using xhci-tegra
[  469.349689] usb 1-2: device descriptor read/64, error -71
[  469.569687] usb 1-2: device descriptor read/64, error -71
[  469.789527] usb 1-2: new full-speed USB device number 17 using xhci-tegra
[  469.909710] usb 1-2: device descriptor read/64, error -71
[  470.129688] usb 1-2: device descriptor read/64, error -71
[  470.349536] usb 1-2: new full-speed USB device number 18 using xhci-tegra
[  470.356817] usb 1-2: Device not responding to setup address.
[  470.565997] usb 1-2: Device not responding to setup address.
[  470.773525] usb 1-2: device not accepting address 18, error -71
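
The numeric codes in these logs are negated Linux errno values. A minimal sketch for tallying them per device path follows; the table covers only a few codes, with wording paraphrased from the kernel's USB error-code documentation, and `tally_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Linux errno values as reported by the USB core (subset, paraphrased).
USB_ERRNO = {
    32: ("EPIPE", "endpoint stalled"),
    71: ("EPROTO", "bit-stuffing error or no response within the bus turnaround time"),
    110: ("ETIMEDOUT", "transfer timed out"),
}

# Handles both "(err = -71)" and "error -32" message styles.
ERR_RE = re.compile(r"usb (\S+): .*?(?:err = |error )(-\d+)")


def tally_errors(log_text):
    """Count (device path, errno name) occurrences in a kernel log."""
    counts = Counter()
    for line in log_text.splitlines():
        m = ERR_RE.search(line)
        if m:
            dev, code = m.group(1), -int(m.group(2))
            name = USB_ERRNO.get(code, ("unknown", ""))[0]
            counts[(dev, name)] += 1
    return counts


sample = """\
[  468.069703] usb 1-3.3: device descriptor read/64, error -32
[  468.249703] usb 1-3.3: device descriptor read/64, error -32
[  469.349689] usb 1-2: device descriptor read/64, error -71
[ 1185.934389] usb 1-3-port4: cannot reset (err = -71)
"""
for (dev, name), n in tally_errors(sample).items():
    print(f"{dev}: {name} x{n}")
```

In these logs, -32 (EPIPE) appears on the cameras behind the hubs, while -71 (EPROTO) appears on the hub ports themselves, i.e. at the bus level between the root port and the hub.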

Hi EvanKwok,
There might be a bandwidth issue when connecting multiple USB cameras to a single USB3 port. We suggest you try a PCIe-USB hub such as https://devtalk.nvidia.com/default/topic/1027100
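
As a rough illustration of the bandwidth concern, the nominal high-speed rate can be split evenly across the cameras. This is only a back-of-the-envelope sketch: it assumes the 480 Mbit/s signalling rate shown in the topology listing and ignores protocol overhead, so real throughput per camera is lower. `nominal_share_mbps` is a hypothetical helper.

```python
def nominal_share_mbps(link_mbps, n_devices):
    """Even split of a link's nominal signalling rate across n_devices."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return link_mbps / n_devices


HIGH_SPEED_MBPS = 480  # nominal USB 2.0 high-speed rate ("480M" in the listing)

# Three cameras sit behind each hub, per the topology listing.
per_hub = nominal_share_mbps(HIGH_SPEED_MBPS, 3)
# Pessimistic case (assumption): all six cameras contend for one 480 Mbit/s link.
all_six = nominal_share_mbps(HIGH_SPEED_MBPS, 6)

print(f"per camera, 3 per hub: {per_hub:.0f} Mbit/s")
print(f"per camera, 6 on one link: {all_six:.0f} Mbit/s")
```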

Hi DaneLLL,

Is the SD card issue also related to the USB bandwidth? I’m actually not even running any cameras at that time. At least when I switch to SS0 everything seems normal.

As for the camera enumeration, would it make sense for me to try adjusting the bandwidth request on the device side? We have no chance to modify the hardware, so hopefully we can solve these issues under the current conditions.

Hi EvanKwok,
USB_SS1 is not enabled on the default carrier board. It looks like you are running your own custom board?
Which USB config in the OEM design guide are you using?
https://developer.nvidia.com/embedded/dlc/jetson-tx2-tx2i-oem-product-designguide

Hi DaneLLL

Yes, we’re running on our own custom board, and we’re using config #4.

Hi EvanKwok,
Below is a sample device tree for config #4.
https://devtalk.nvidia.com/default/topic/1030635/jetson-tx2/tx2-config-4-for-usb-lane-mapping/post/5243174/#5243174

In config#4, PEX_RFU is enabled. Do you observe the same issue on PEX_RFU? Are there USB3 Type A ports connected to both PEX_RFU and USB_SS1 on your custom board?


Hi DaneLLL

Sorry, I just confirmed with our hardware engineer that we modified the layout earlier. The SD card slot is actually connected to PEX_RFU, and it behaves normally when switched to SS0.

Hi EvanKwok,
Can you try USB_SS1 also? It is enabled in config #4.

Have you inspected the device tree?

Also please check VBUS:
https://devtalk.nvidia.com/default/topic/1036547/jetson-tx1/usb2-b43-b42-not-working-on-tx1-with-r28-2/post/5265708/#5265708

Hi DaneLLL

Sorry for the late reply.

We have tried USB_SS0 and everything works fine, but we have not yet tried USB_SS1, because we use it on the OTG port and it currently does not recognize USB3.0 devices, but that’s another issue.

Yes, we checked the device tree, and it was confirmed by our supplier’s FAE.

We also tried pulling VBUS down and then up again when the issue occurs, but nothing seems to change after the operation.

Hi EvanKwok,
Please attach the HW layout of the USB lane mapping for review.

Hi DaneLLL,

Please check the attachment


Hi EvanKwok,
https://developer.nvidia.com/embedded/dlc/jetson-tx2-tx2i-oem-product-designguide
There might be an issue in the HW design of the USB3-to-SD path on PEX_RFU. Have you checked the OEM design guide?

Hi EvanKwok, please check and confirm that your design has followed the “Signal Routing Requirements” of the USB part in the OEM DG.

Hi DaneLLL & Trumany

I had my HW engineer check the document, and he didn’t find anything wrong. Can you point out the exact problem?

You mean the impedance, trace length, pair skew, uncoupled length, etc. all match the requirements of the OEM DG? Did you check the USB signal quality with a scope? Since this looks like an unstable signal issue, you’d better check the USB signal quality first to confirm there is no HW issue.

We’re also confused by the different outcomes between 64GB and 128GB cards. The issue occurs with the latter, while the former is much more stable.

How many different 128GB cards did you try? There might be a compatibility problem with some cards.

Hi Trumany

Please check the attached report about the USB signal quality.

USB3_1 Device 1_Report_2018-08-31_USB-SD.mht.7z (352 KB)

We tried about 4 models, and the errors we saw in the kernel log are I/O failures with the SD device, which doesn’t seem to be a general compatibility issue?

No failed items are seen in your report, so the signal quality should be good enough. There seems to be no connection issue. I have no clue from the HW perspective.