I’m replying as I read, so some points may repeat or appear out of order relative to an ideal reply…
When you say the “lsusb -tv” output does not differ, is this an exact match between conditions, whereby the speed listed at the end of each line (e.g., “5000M”, “12M”, and so on) does not change? Knowing that the root HUB and the tree itself have maintained their settings is useful information. It would tend to mean that device modes have not fallen back to lower speeds (you’d have to monitor for speed changes, since a fallback is more or less invisible without specifically checking; dmesg would of course note whenever a device re-enumerates at a different speed).
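If you want to verify that nothing re-enumerates, a simple approach is to snapshot the tree before and after reproducing the problem, and to watch the kernel log live (the file names here are just hypothetical scratch locations):
lsusb -tv > /tmp/usb_before.txt
lsusb -tv > /tmp/usb_after.txt
diff /tmp/usb_before.txt /tmp/usb_after.txt
dmesg --follow | grep -i usb
An empty diff means the tree and speeds held; the dmesg line prints USB messages as they arrive (older systems use “dmesg -w”).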
“lsusb -tv” should be considered the authority on whether two root HUBs are in use, or just one; it is also the authority on what each root HUB is servicing. If you have two devices at 5000M, and the root HUB is 10000M, then you have enough bandwidth at all times (latency might still differ between one and two 5000M devices, but not significantly compared to “normal” operation). Having more root HUBs does not guarantee that everything goes to the correct USB port; the correct wiring and device tree are required for everything you want on a given HUB to actually reach it. Again, “lsusb -tv” describes what actually exists at any given moment. So far it sounds like there is no change at all in your “lsusb -tv”, which is important information.
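As a quick way to list just the root HUBs and their speeds (driver names in the output vary by platform):
lsusb -tv | grep -i root_hub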
Note that the older USB standards, USB 2 (“480M”) and slower, share a single controller (root HUB) for all of the legacy protocols up to and including USB 2. USB3+ root HUBs run independently, and for a USB3 device to fall back to USB2 or slower, the device must actually migrate from the USB3 root HUB to the legacy HUB. At that moment “lsusb -tv” will change. An example: you plug in something rated for USB3, but the cable has insufficient signal quality; the initial plug-in would show routing to the USB3 root HUB, and a moment later “lsusb -tv” would instead show routing to a legacy HUB. I’m basing all of this on there being no migration, which means signal quality is not part of the problem (power requirements could also cause such a migration, but this too is ruled out by the non-changing “lsusb -tv”).
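If you suspect such a fallback happened at some point, the kernel log records every enumeration together with the negotiated speed; the exact message wording varies by kernel version, but something like this usually catches it:
dmesg | grep -iE 'new .*speed usb device'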
Trivia: FYI, an Intel CPU on a desktop PC has what is known as an I/O APIC (Advanced Programmable Interrupt Controller). The I/O APIC controls which CPU core an interrupt is routed to, and tends to be programmed by the scheduler depending on policies. AMD does something different, but similar, for its desktop CPU interrupts. Only a small subset of a Jetson’s hardware interrupts can be migrated to different cores. If you run this command you can see the hardware interrupt counts, some statistics, and descriptions:
cat /proc/interrupts
The above is a lot of output; you could use “less /proc/interrupts” instead, but the numbers change each time you look, so “cat” is good for a quick look at a given instant, and “less” is good for studying a snapshot taken at one point in time.
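To focus on just the USB lines, and to see how fast those counters move, something like the following works (the grep pattern is a guess; match it against the driver names in the last column of your output):
watch -n 1 'grep -i usb /proc/interrupts'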
The scheduler does not necessarily migrate different processes to different cores just to balance CPU load. It is a misconception that splitting everything evenly among cores makes software execute faster. Desktop PC architectures, and the ARM Cortex-A architecture (which is what a Jetson uses), take advantage of a lot of caching. There are different levels of cache (costing more or less money and power consumption) at different points in the CPU’s memory access path. If one takes two independent programs and runs them on two different cores, then operation would in fact be faster by splitting them across cores; if, however, we are talking about processes or threads which share data, and splitting them across cores means each must access a different core’s cache and then update the other’s, there would be a lot of cache misses, and performance would suffer horribly. The scheduler tries to take this into account, but the scheduler might not understand what the user intends.
Jetsons lack the ability to migrate many hardware IRQs…the wiring simply doesn’t exist. In “/proc/interrupts” you will see a lot of drivers serviced only on CPU0. You could in fact use CPU affinity to try to move those drivers’ hardware to another core. The scheduler would see this and appear to accept it, but what would happen is that the work would simply be migrated back. You’d add CPU load and latency because the scheduler would try to put this on an invalid core, realize it cannot, and then migrate it back to CPU0 anyway.
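For reference, per-IRQ affinity is exposed under “/proc/irq”. The IRQ number 123 below is purely hypothetical; substitute a real number from the first column of “/proc/interrupts”:
cat /proc/irq/123/smp_affinity
echo 2 | sudo tee /proc/irq/123/smp_affinity
The first command shows the current allowed-CPU bitmask; the second requests CPU1 (mask 0x2), and on hardware lacking the routing wires it will either fail outright or simply have no real effect.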
However, there are cases where each core has access to a given piece of hardware, e.g., every core has its own timers. There are also cases where an entire group of hardware can migrate, but not its individual parts (the GPIO controller, I think, is set up this way; you can’t migrate a specific GPIO to a different core, but I think you can migrate a group of GPIOs, a GICv3 device on Orin).
Note that at the bottom of “/proc/interrupts” there is a “Rescheduling interrupts” row. These can come from trying to route an interrupt to an invalid core, but a lot more than that can cause them, e.g., one process might be lower priority while another holds the same core, so its execution is delayed. The most interesting row is “Err”, the error count, which should be zero.
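You can pull out just those summary rows with something like this (row labels can vary slightly between kernel versions):
grep -iE 'rescheduling|err' /proc/interrupts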
Setting your Jetson for max performance with nvpmodel is the most reliable and easiest way to improve latencies (at a slight cost in power consumption). This won’t change the default scheduling policy, but it will keep clocks higher and allow more power consumption and somewhat higher temperatures.
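For example (power model IDs differ between Jetson modules and L4T releases, so query first rather than assuming 0 is max performance):
sudo nvpmodel -q
sudo nvpmodel -m 0
sudo jetson_clocks
The “-q” shows the current model, “-m” switches models, and jetson_clocks additionally pins clocks at their maximums.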
For what follows you probably want to be a bit pickier about the terms “process” and “interrupt”. A process ID (PID) is a user space notion, whereas a hardware IRQ lives entirely in kernel space. Both are ways of identifying which software runs where, but one tends to speak of hardware drivers via their IRQ and user space software via its PID. Utilities which operate on a particular program operate on the PID; directly changing the priority of a driver is not the normal situation. However, if a given PID uses a particular piece of hardware which has a given driver on a given IRQ, then changing the scheduler priority of the PID can indirectly change the IRQ’s priority. The scheduler is not obligated to honor requested changes (if this were hard realtime hardware that would not be the case…ARM Cortex-R hardware is capable of running with guaranteed timings and never ever missing).
If you have a program which runs two cameras together, then increasing the priority of the program could (and often will) change the priority of the kernel driver whenever a hardware IRQ is issued. Both cameras may be on one cable, but technically they are separate devices unless the internals of the stereo camera include hardware to synchronize timing. It is good if the cameras or devices self-synchronize for stereo; if you tried to get the two cameras to synchronize by controlling the driver at the hardware IRQ level, you’d fail. You could tune this and improve it, but ultimately you couldn’t make it deterministic on this hardware (two separate cameras in stereo are inferior to one physical device with two cameras having internal shutter timing sync, since the latter gives deterministic operation of both cameras at the same instant in time even if the hardware IRQ timing splits into two IRQs with slightly different times).
When you are talking about a process via its PID, the scheduling “pressure” is the “nice” number. It is called nice because the higher the number, the nicer the process is about letting other processes push it out of the way. A nice value of zero is the default for user space programs. Anyone can set their own program to be “nicer”, but if you want a higher priority (a negative nice number), then you must use root authority. Is there a single program which runs both cameras? Perhaps it runs other things as well, which would make this work less well, but you could renice (that’s an actual program name) your program to something like a nice level of “-2”.
Beware that you should not simply give processes higher priority at random; there can be unintended consequences, e.g., priority inversions. See:
With those you could either start your program with a nice of “-2”, or migrate it to that higher priority while it runs. If you go to “-5” you’ll probably have unintended consequences. Note that if your program is already just one step less nice than another, then making its nice value even more negative isn’t going to help. Regardless of how much priority you give your program, if the CPU core doesn’t have time to do what it needs for both cameras, then more priority isn’t going to magically make the core faster.
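A minimal sketch of both approaches (“camera_app” and PID “12345” are hypothetical placeholders):
sudo nice -n -2 ./camera_app
sudo renice -n -2 -p 12345
The first starts the program at nice -2 (negative values require root); the second migrates an already running process by PID.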
For drivers, one can also take a program (and indirectly the driver serving it; we’re talking to the scheduler) and bind it to a specific CPU core. This is CPU “affinity”. You won’t get much help from this for USB itself, since I think the wiring needed to migrate the USB controller to another core doesn’t exist. One can mark a program or process with a cgroup, and then use various methods to assign that cgroup to a specific core (which will promptly reschedule to CPU0, wasting time, when no wiring exists to reach your favored core). However, the way CPU affinity could help is that there are likely a number of software-only processes (things not requiring wires to hardware run on software IRQs…“soft” IRQs), and soft IRQs can migrate to any core at any time (once again, there may be performance penalties from cache misses in doing this). This means you could find software processes on CPU0 and offload them to another core, leaving hardware IRQs a less loaded CPU0. I don’t think this will be of huge benefit to you though, due to the bandwidth USB3 stereo pushes.
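For reference, the simplest user space tool for affinity is taskset (the PID and core numbers here are hypothetical):
taskset -pc 12345
taskset -pc 2,3 12345
taskset -c 2 ./some_program
The first line shows which cores PID 12345 may run on, the second restricts it to cores 2 and 3, and the third launches a program pinned to core 2.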
When building hardware drivers it is considered efficient if the hardware IRQ does the minimum possible work (taking the least time), and then completes any remaining work by issuing a software IRQ handled by a different driver. This means the “wired” connection locks the core for less time. The software IRQ might even run on the same core in order to take advantage of cache hits, but the split still gives the system an opportunity to preempt and run more smoothly (parts of your hardware IRQ will be atomic, so preemption might be denied for a short time; those times add up).
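You can watch the per-core soft IRQ side of that split (rows such as TIMER, NET_RX, and TASKLET) with:
cat /proc/softirqs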
Regarding your specific “lsusb -tv”…
Your root HUB at bus 2, port 1, device 1, has 10000M of bandwidth. The first two items on that root HUB use 5000M each, which by itself accounts for the entire 10000M (2 × 5000M). That completely uses the bandwidth (many devices don’t transfer data 100% of the time, so it might not actually be that bad, but often it is). Then I see another HUB, itself 10000M, consuming from that original root HUB. If that HUB runs at the same time as both cameras, then you are guaranteed traffic congestion. Just because a HUB is in 10000M mode doesn’t mean it will actually use that much bandwidth, e.g., if you plug a mouse into that HUB, then its consumption is trivial.
However, if you look at that other 10000M HUB (bus 2, port 3, device 2) from Realtek, I am unable to tell what those devices actually are. Most of them are 5000M, so even if this non-root HUB had its own separate root HUB, that HUB all by itself has consumers exceeding its 10000M of bandwidth. And this 10000M is being consumed (in your case) on a root HUB that already carries 10000M from the USB Video Class cameras. There is no possibility that USB can provide enough bandwidth if all devices operate at the same time (some devices might buffer and burst, and sometimes work, but on average it will fail).
My conclusion is that most of your problems are due to running everything from a single root HUB which is not even remotely capable of handling all devices simultaneously. That particular “lsusb -tv” does not show the existence of a second USB3 root HUB.
Root HUBs will not automatically route to a given port. You have to (A) have the drivers present (you must have that, since one of them shows up), (B) have the drivers know to use that port (a device tree setting; this is the firmware telling drivers how to bind to hardware at given addresses), and (C) have the actual wiring exist. For example, if you routed only the legacy USB2 wiring, then no matter what you do, the USB3 root HUB won’t show up. If you did everything correctly on the schematic, but didn’t properly bind things together in the device tree, then the hardware will appear to be missing. If you have the wiring for a USB3 10000M root HUB which you do not see, then it is likely your device tree is incorrect.
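A quick sanity check on what the firmware actually handed to the kernel (node and driver names vary by Jetson model, so treat these patterns as guesses):
ls /proc/device-tree | grep -i usb
dmesg | grep -iE 'xhci|xusb'
The first shows USB-related device tree nodes; the second shows which USB3 host controllers actually probed at boot.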
If you could actually get a second USB3 root HUB, then you’d still be at the edge of your bandwidth, but you would stand a chance. Currently, giving drivers higher priority would not help, at least not enough to solve the problem. I suggest examining your USB3 wiring and device tree to get a second root HUB at USB3 speed to show up. Perhaps you truncated your “lsusb -tv”, but if not, then there is no chance this will succeed beyond the first two cameras unless you cut down resolution.