USB3 via PCIe card

I am trying to add USB3 ports to my Jetson and have installed a USB3 mini-PCIe card, listed here http://www.amazon.com/gp/product/B009WN7SQA. This is the same one listed as untested on the wiki here http://elinux.org/Jetson/mini-PCIe.

I connected a USB3 camera (IDS uEye) and attempted to capture at full rate (1280x1024, 8-bit greyscale, 60 fps), which comes to about 600 Mbps. This configuration works fine with the built-in USB3, but through the card I get USB transfer errors that cause the transfer to halt. I’ve also tried a much lower bandwidth transfer (320x240, 8-bit, 30 fps, about 20 Mbps) and also get transfer errors, though not as frequently. I’ve looked at lsusb -t, and the port and device are both listed as 5000M.
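
For reference, the raw data rates quoted above can be checked with shell integer arithmetic. This is only a sketch: raw_mbps is an illustrative helper name, and it counts raw pixel data only, ignoring any USB protocol overhead.

```shell
# raw_mbps WIDTH HEIGHT BITS_PER_PIXEL FPS
# Hypothetical helper: integer megabits per second of raw pixel data,
# rounded down, with no allowance for protocol overhead.
raw_mbps() {
    echo $(( $1 * $2 * $3 * $4 / 1000000 ))
}

raw_mbps 1280 1024 8 60   # full-rate configuration
raw_mbps 320 240 8 30     # reduced configuration
```

The two results are 629 and 18, consistent with the rounded figures of about 600 Mbps and 20 Mbps.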

I was wondering what might be causing the poor performance. Is there a fix? Has anyone found another USB3 PCIe card that works well? The end goal is to get 2 or 3 cameras transferring at the max rate listed above. Has anyone found a configuration that would enable this?

Thanks,
Daniel

Some information from lsusb would help. First, if you run “lsusb” by itself, it shows a brief format…identify the camera, then look at the “ID”. You can use the ID to pick the specific USB device. An example is that a TK1 in recovery mode has “ID” of “0955:7140”. Then use verbose mode and save this to file, e.g., if I were to do this on the TK1 in recovery mode the command would be (substitute with camera ID):

lsusb -d 0955:7140 -vvv | tee log_usb.txt

Also needed is the output of “lsusb -t” (tree view). Basically, what is needed is a log of the camera when plugged into the working native USB connector, as well as the same log when the camera is plugged into the mini-PCIe USB, for both the tree view and the verbose log.
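
A minimal sketch of collecting both logs for each port follows. The helper name capture_usb_logs and the output file names are made up for illustration; the ID shown is the camera ID that appears in the logs posted in this thread, so substitute whatever plain “lsusb” reports.

```shell
# capture_usb_logs ID LABEL
# Hypothetical helper: saves the tree view and the verbose descriptor dump
# for one port. LABEL is only used to name the output files.
capture_usb_logs() {
    id="$1"      # vendor:product as shown by plain lsusb, e.g. 1409:3240
    label="$2"   # e.g. native or pcie
    lsusb -t > "usb_tree_${label}.txt"
    lsusb -d "$id" -vvv > "usb_verbose_${label}.txt"
}

# Usage: run once per port, with the camera plugged into that port.
#   capture_usb_logs 1409:3240 native
#   capture_usb_logs 1409:3240 pcie
#   diff usb_verbose_native.txt usb_verbose_pcie.txt
```

The bus and device numbers will differ between the two verbose logs; any other difference in the diff is worth a closer look.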

here is lsusb -t for native

/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=tegra-ehci/1p, 480M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=tegra-xhci/2p, 5000M
    |__ Port 1: Dev 7, If 0, Class=Vendor Specific Class, Driver=usbfs, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=tegra-xhci/6p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    |__ Port 1: Dev 31, If 0, Class=Hub, Driver=hub/3p, 480M
        |__ Port 2: Dev 32, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 2: Dev 32, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 3: Dev 33, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M

and verbose lsusb for native

Bus 004 Device 007: ID 1409:3240  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0         9
  idVendor           0x1409 
  idProduct          0x3240 
  bcdDevice            0.00
  iManufacturer           1 
  iProduct                2 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           57
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              224mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15

now here is lsusb -t for pcie

/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=tegra-ehci/1p, 480M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=tegra-xhci/2p, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=tegra-xhci/6p, 480M
    |__ Port 3: Dev 7, If 0, Class=Hub, Driver=hub/3p, 480M
        |__ Port 2: Dev 8, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 2: Dev 8, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 3: Dev 9, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
    |__ Port 2: Dev 114, If 0, Class=Vendor Specific Class, Driver=, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M

and now lsusb -vvv for pcie

Bus 002 Device 114: ID 1409:3240  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0         9
  idVendor           0x1409 
  idProduct          0x3240 
  bcdDevice            0.00
  iManufacturer           1 
  iProduct                2 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           57
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              224mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15

In both native and PCIe, it looks like full USB3 speed is enabled. However, the driver which the hotplug layer is handing off to changes depending on native or PCIe USB. For native tree view, this applies:

/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=tegra-xhci/2p, 5000M
    |__ Port 1: Dev 7, If 0, Class=Vendor Specific Class, Driver=usbfs, 5000M

Note the “Driver=usbfs”. Now tree view under PCIe:

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
    |__ Port 2: Dev 114, If 0, Class=Vendor Specific Class, Driver=, 5000M

…note that hotplug never handed this off to any driver (or there is a bug in lsusb -t)! This would account for the differences between native and PCIe: both see the device and communicate with it properly, but hotplug is falling short, since the same driver should show in both cases. In verbose view, both see the correct idVendor and idProduct, so basic USB communication with the camera is as it should be, and the error is not at the USB level.

If you were to plug the camera into the native USB port, drivers would be associated with the camera. If a driver is built as a module, the module would show via lsmod (if the feature is compiled into the kernel, it would not show with lsmod). Unplugging the device from USB would probably not rmmod the driver, so comparing lsmod output between native and PCIe USB requires a reboot between runs (we’re interested in what the system does automatically, not what we can do manually). So here is what to test (I’m pretty sure the driver is in module format)…

With the TK1 off, unplug the camera. Power up. Run lsmod, note the output. Plug in the camera again…see if you spot any difference in modules from lsmod. Shutdown Jetson, remove the camera, boot back up. lsmod should be the same as the original lsmod with no camera. Now check lsmod while the camera is in the PCIe USB. How does lsmod differ between native with/without a camera, as well as how does it differ between PCIe USB with/without…and especially, how does with camera differ between native and PCIe USB? We’re looking for a chain of modules which are required for the camera, and how the modules differ (but should not) between the use of the two different USB connectors.
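
The comparison above could be scripted roughly as follows. This is a sketch only: snapshot_modules and the file names are hypothetical, and it keeps just the module names so that size and use-count changes do not clutter the diff.

```shell
# snapshot_modules LABEL
# Hypothetical helper: saves the sorted list of loaded module names,
# dropping the lsmod header line.
snapshot_modules() {
    lsmod | awk 'NR > 1 { print $1 }' | sort > "lsmod_$1.txt"
}

# Usage, following the boot sequence described above:
#   snapshot_modules native_without   # clean boot, no camera
#   snapshot_modules native_with      # after plugging into the native port
#   (shut down, remove the camera, boot again)
#   snapshot_modules pcie_without     # clean boot, no camera
#   snapshot_modules pcie_with        # after plugging into the mini-PCIe port
#   diff lsmod_native_without.txt lsmod_native_with.txt
#   diff lsmod_native_with.txt lsmod_pcie_with.txt
```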

All 4 cases are identical

Module                  Size  Used by
dm_crypt               13259  0 
dm_mod                 73887  1 dm_crypt
rfcomm                 38359  0 
bnep                   10469  2 
bluetooth             307068  10 bnep,rfcomm
rfkill                 10365  3 bluetooth
nvhost_vi               3064  0

This one is a bit puzzling, I’m running into documentation issues figuring out what kernel CONFIG applies to this. There is supposed to be a CONFIG_USBFS, and a deprecated CONFIG_USBDEVFS (same thing, different naming), but I can’t find a reference to this in the 3.10.40 kernel config of TK1. This is probably irrelevant anyway, as you know the feature exists when you use your camera on the native USB port, and simply moving the camera from native to the PCIe USB port would not have any effect on kernel features existing or not.

Out of curiosity, if you cleanly boot the TK1, plug the camera into the native USB port, use it briefly, and then move it to the PCIe USB port (without rebooting), does this change anything? Does the lsusb -t show “Driver” differently if you first use the native port before using the PCIe port?

I’ve tried plugging into both ports after reboot now a few times, and the driver is always showing up as usbfs. I can’t reproduce the empty driver field I originally sent you.

However, I’m still experiencing the poor performance through the mini-PCIe card.

For the PCIe part of the issue, you can run “lspci”, which is a brief output similar to lsusb. From this you can find the line which is the mPCIe USB card. The left column will have identifying information so you can list just the one item, e.g., I have a PCI ethernet controller which is “00:0a.0”, which could be listed specifically and verbosely with log via (substitute with your mPCIe USB card information):

lspci -s '00:0a.0' -vvv | tee pci_log.txt

Let’s find out what the PCIe controller is thinking.

here it is

01:00.0 USB controller: Renesas Technology Corp. uPD720202 USB 3.0 Host Controller (rev 02) (prog-if 30 [XHCI])
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 130
	Region 0: Memory at 32200000 (64-bit, non-prefetchable) 
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

The lspci command requires root privilege to show the things required (note the “<access denied>”). Try again, but with “sudo lspci…”, or first run “sudo -s” for a root shell.

01:00.0 USB controller: Renesas Technology Corp. uPD720202 USB 3.0 Host Controller (rev 02) (prog-if 30 [XHCI])
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 130
	Region 0: Memory at 32200000 (64-bit, non-prefetchable) 
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
		Vector table: BAR=0 offset=00001000
		PBA: BAR=0 offset=00001080
	Capabilities: [a0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <4us, L1 unlimited
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [150 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Kernel driver in use: xhci_hcd

Looking at the PCIe information on the mini-PCIe slot USB3 card, I see it is running at the maximum generation 2 speed (5GT/s), and seems to be 100% correct. The same is true for the actual lsusb listing, the USB part of this mPCIe card seems to be correct and running at maximum speed.
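
For anyone repeating this check, the negotiated link speed and width can be pulled out of a saved verbose log. A minimal sketch, assuming the log was saved with root privilege so the capability block is readable; pcie_link is an illustrative helper name.

```shell
# pcie_link LOGFILE
# Hypothetical helper: prints the negotiated "speed width" from the LnkSta
# line of a saved "lspci -vvv" log.
pcie_link() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([^,]*\), Width \([^,]*\),.*/\1 \2/p' "$1"
}

# Usage:
#   sudo lspci -s '01:00.0' -vvv > pci_log.txt
#   pcie_link pci_log.txt
```

For the log posted above, this prints the 5GT/s speed and x1 width from the LnkSta line, which matches what LnkCap reports as the maximum.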

So I looked a bit closer at the camera and transfer type. Transfer type is “bulk”. This would be appropriate for non-real-time, such as snapping a high resolution frame and then transferring a single frame (or a file transfer). The correct mode for a full motion video or audio device would be “isochronous”. Am I correct that this camera is full motion video?

Recently there was another video camera which was incorrectly using bulk transfer instead of isochronous due to a firmware issue. One thing which helped there (but isn’t the correct cure) was to increase buffer size for the USB. I suspect this is the reason why native USB3 works for you, but the add-on mPCIe USB “stutters” (different buffer sizes). With bulk transfer being incorrect, one port might work with a larger buffer, while the other shows symptoms the larger buffer would tend to make less obvious.

So, assuming this is full motion, do you have another desktop type Linux machine you could plug the camera into and verify if “Transfer Type” is “bulk” or “isochronous”? I’d like to see if a baseline desktop has any control to pick something other than bulk (doubtful, transfer mode should be picked by the camera unless there is a bug in the USB driver falling back to bulk when isochronous is preferred). It is possible under Windows you might be able to see the device transfer mode (either a device property or a driver property), and see if it is also bulk, but I’m not sure if Windows has an easy way to see this particular parameter.

You are correct that it is full motion video. I tried it on my laptop, and lsusb gives the transfer type as bulk there as well. I’m not getting any capture issues on my laptop. Is bulk vs. isochronous something that would be set in the call to the SDK? I am currently testing with the open source ueyedemo, so I could poke around and see if it is chosen somewhere.

I looked around in the SDK a bit, and it is intentional that they choose bulk transfer, for its “error correction”. On Windows the size can be changed through their sample camera manager program, but that does not appear to be the case on Linux.

Bulk versus isochronous is part of the firmware chosen in the camera and communicated to the root HUB controller. This could not be changed without rewriting the USB code to not follow USB standards, or as an alternative, to change the firmware in the camera.

Buffers associated with USB ports can be a mix of buffer physically in one of the USB chips, or it can be part of the operating system, or it can be both. Think of it as similar to different cache levels in a CPU where there might be a small very very fast cache talking to a larger (but still small) fast cache, and this in turn talking to system memory. I’m not sure what is physically present in USB3 port support of the Tegra K1, but further buffering can be chosen for this via kernel command line parameters. How the USB port is connected to the Tegra K1 may differ from how the mini-PCIe USB3 port connects, and so more buffer may or may not be possible in the same way for add-on USB cards.

If you are familiar with network programming, a very common description of the difference between TCP and UDP is often given as “TCP detects and corrects for errors”, whereas “UDP sends on a best effort basis, and does not correct for out of order or missing packets”. Bulk versus isochronous USB has a similar comparison.

For bulk USB the device communicating with the root HUB can be told to stop or go, and order and success/failure of data transfer is monitored such that stop/start can be achieved without loss of data. A good example is a hard drive, where it may need to transfer a chunk of data, but halting for a moment won’t cause the drive to be erased. There is a moment of transition between start and stop where some data being sent may not actually make it through and on restart this must be accounted for…not hard when the sending buffer can be refilled from a hard disk. If you are using a camera, and you are told to stop sending, all is good if and only if the buffer with the part that did not make it through at that transition is still in the buffer. Unfortunately, the camera does not stop taking frames and adding data to the buffer…either new data or data from the moment of start/stop transition must be lost. Something sent over the wire is guaranteed to have proper order and integrity, but the data at the camera end buffer will be lost as soon as transition is not handled correctly or if stop time is too long and the new data frames overrun the buffer. Camera buffer is distinguished from host USB buffer, but a larger host USB buffer means the camera is going to have more time available to it in which it is allowed to send data. More send time available…fewer camera buffer overruns.
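
To put a rough number on how little slack the camera-side buffer gives, here is a sketch of the arithmetic. The 1 MiB buffer size is purely an assumption for illustration (the camera’s real buffer size is not known here), and stall_budget_us is a hypothetical helper name.

```shell
# stall_budget_us BUFFER_BYTES RATE_BPS
# Hypothetical helper: microseconds that a buffer of the given size can keep
# absorbing data arriving at the given rate before it overruns.
stall_budget_us() {
    buffer_bytes="$1"
    rate_bps="$2"
    echo $(( buffer_bytes * 8 * 1000000 / rate_bps ))
}

# Full-rate stream from this thread: 1280 * 1024 * 8 * 60 = 629145600 bits/s.
# ASSUMPTION: a 1 MiB (1048576 byte) camera buffer, for illustration only.
stall_budget_us 1048576 629145600
```

Under that assumption the result is 13333 microseconds, less than one frame period at 60 fps, so even a short halt on the host side is enough for new frames to overrun the camera buffer.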

Isochronous mode was intended for real time. Isochronous gives a guaranteed regular time slice without stopping and starting. Because the time slice is guaranteed, one cannot halt and resend data at the USB level, and so data is lost instead of being corrected. If you are listening to audio and the gap of lost time is 10 microseconds, a listener won’t ever know this happened. If the data is video, then perhaps a part of a frame will be missing, or an entire frame. Software could reconstruct a missing part of a frame or an entire frame based on prior frame, but this is out of the scope of the USB driver when in isochronous mode. Should the USB transfer always have enough bandwidth, e.g., a USB3 connection talking to a keyboard, there is no possible way the USB layers would ever get overwhelmed from keyboard data. Once you go to something like a camera the required bandwidth approaches the limits of the reserved bandwidth of isochronous mode…but if the consumer of data (your video software) can consume all data fast enough, and if the USB driver buffer does not overrun, you’ll still get 100% correct transfer. The driver itself only gets a certain time slice, so buffer in the port hardware may get overrun even if the system buffer is enormous.

If we were doing network programming, I’d say TCP is incorrect for the situation because the error correction is not tailored to the nature of the data. Instead, one would write a custom reliability layer and use UDP sockets. The same could be done for the camera USB, but doing so would not be easy (nor trivial).

For this particular camera on this particular mini-PCIe USB3 adapter, it seems that the case of lost data (buffer overrun or data lost at the moment the transfer must momentarily halt) is not being handled. Choice one would be to stop the data from being lost; choice two would be to accept the loss and write the software to gracefully handle the loss.

Since the camera is in bulk mode, and there is no way to add buffer at the camera end, one can assume that eventually the camera will lose data even if increasing buffer size in the system (remember, buffering occurs at multiple levels, system side is the only side you can control without changing the USB port buffer itself, which would imply changing the chip soldered to the board). The implication is that the software reading the camera data must be changed to better handle lost data…bulk transfer error handling works for devices with persistent storage, but has limits for devices with only a small amount of cache and no persistent storage.

At this point all I can think of is to investigate whether the mini-PCIe USB3 card has a kernel parameter for USB buffer size and test an increase. I’m uncertain as to how one would set a larger buffer on a third-party add-on USB3 card…perhaps, since it uses the same xhci driver, the same parameter could be used with some small variation to identify the mini-PCIe version. This would only mitigate the problem; a buffer size increase would not fix it.
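
One candidate to check is the usbcore parameter usbfs_memory_mb, which limits the memory usbfs will allocate for transfer buffers. Since the camera binds to usbfs on both ports, this limit would apply to either port rather than to the mini-PCIe card specifically. Whether the TK1’s 3.10.40 kernel exposes it, and whether raising it helps with this card, are assumptions to be tested. The helper name usbfs_limit_mb and the value 64 are illustrative only.

```shell
# usbfs_limit_mb [PATH]
# Hypothetical helper: prints the current usbfs buffer limit in MB, or
# "unavailable" if the parameter file is not readable on this kernel.
usbfs_limit_mb() {
    param="${1:-/sys/module/usbcore/parameters/usbfs_memory_mb}"
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo "unavailable"
    fi
}

# Usage:
#   usbfs_limit_mb
# Trying a larger limit (ASSUMED value; needs root, reverts on reboot):
#   echo 64 | sudo tee /sys/module/usbcore/parameters/usbfs_memory_mb
```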

Thanks a lot for the detailed explanation. I’ll look into the system buffer size, and also might post another question to see if anyone has successfully tested any mini-PCIe USB3 card. Seems like this should be a pretty common configuration.

Hi,

Do you have the PCIe USB3 card driver for the Jetson TK1? I cannot make it work with my USB 3.0 camera.

Thanks,

dizzy

This page mentions a card that might work.