The USB transfer rate does not meet expectations

Hi,
I am testing the USB transfer rate on the Orin devkit.
The USB device enumerates at 10000M (SuperSpeed+) successfully.

I use this command to test the transfer rate:
sudo dd if=/dev/sda1 of=/dev/null bs=4M count=256 iflag=direct

The transfer rate is below 800 MB/s.

When I test this USB device on a PC, the transfer rate meets expectations (10000 Mbit).

How can I optimize the USB transfer speed?

Hi,
Here are some suggestions for the common issues:

1. Performance

Please run the commands below before benchmarking a deep learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2. Installation

Installation guide of deep learning frameworks on Jetson:

3. Tutorial

Startup deep learning tutorial:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and any customized app with us so that we can reproduce it locally.

Thanks!

This advice does not help.
I inserted a USB 3.1 Gen 2 mass storage device and used this command:
sudo dd if=/dev/sda1 of=/dev/null bs=4M count=256 iflag=direct

This is not an answer, but rather some information related to the situation…

First, keep in mind that the quoted USB speed is a raw signaling rate, and that it is in units of “bits” per second (in this case 10 Gb/s). Disks and many devices often state speed in “bytes” per second, and one must also distinguish powers of 10 from powers of 2 (GB versus GiB). 10 Gb/s is 10/8 GB/s, or 1.25 GB/s. The raw data rate on the USB bus probably achieves this, but then there are various kinds of overhead. 800 MB/s is an actual throughput of 6.4 Gb/s. This could see some improvement, but it isn’t unreasonable after overhead.
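
As a quick back-of-the-envelope check (plain arithmetic, nothing Jetson-specific assumed):

echo '10 * 1000 / 8' | bc      # 10 Gb/s line rate  -> 1250 MB/s theoretical ceiling
echo '800 * 8 / 1000' | bc -l  # 800 MB/s observed  -> 6.4 Gb/s effective throughput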

Next, if the storage device itself cannot sustain that data rate, then the bottleneck is something other than USB, and USB cannot magically speed up the disk.

In this case, using dd can also change how cache is used for the disk. dd can in fact completely bypass cache (that is what “iflag=direct” does); the “bs=4M” is the buffer size, so each read stops to fill a 4 MB buffer before the data moves on. It may not be predictable how this changes performance, but on anything which sends data in bursts, the best buffer size is likely the one matching the burst or packet data size. If the buffer is smaller than the packet, then you have more packets and more overhead; if the buffer is larger than the packet, then the data must be broken up and reassembled. I don’t know what that size is for this case, as this is normally a question for wired networking.
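
One hedged way to probe this (the device node and sizes here are only placeholders) is to repeat the direct read with a few different block sizes and compare the reported rates:

# sweep several block sizes; count stays fixed, so the total bytes read differ per pass
for bs in 512k 1M 4M 16M; do
  echo "bs=$bs"
  sudo dd if=/dev/sda1 of=/dev/null bs=$bs count=256 iflag=direct 2>&1 | tail -n 1
done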

Also, you have a root HUB, and that HUB might be connected to more than one end device. What do you see for the output of the command “lsusb -tv”? For your device to achieve the maximum data throughput it has to be the only device touching the root HUB.

Note: the “Gi” prefix is a multiple of 1024: “1 GiB” is 1024 MiB, whereas “1 GB” is 1000 MB. That prefix difference is separate from the bits-versus-bytes issue; you still have to divide a bit rate by 8 to get bytes.

There is also the throughput of other parts of the system, which you might be seeing. The USB is purely a data pipe, and if that pipe is sending bursts at the proper clock rate, then the pauses which drag down the average can come from other bottlenecks. If you are familiar with DMA transfers, this is why DMA exists. As an example, if you copy from a disk to memory without DMA, you will more or less saturate a CPU core to make that transfer. The same transfer, if a DMA address is given, bypasses the CPU and sends data directly through the memory controller; the CPU would no longer be limiting performance (but if something else is sending data through the memory controller, then a competing process can still slow things down).

Some things we don’t know:

  • What competition is there for the USB part? See “lsusb -tv”.
  • Is the process going through a CPU core, or is it using DMA? Consider running an application like htop and seeing if a core’s usage goes up significantly in one case but not much in another (only a small CPU usage increase might suggest DMA is used); see the sketch after this list.
  • I don’t know how to test if the memory controller is saturating.
  • Tell us more about what kind of device sda is.
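
A rough way to check the CPU question (the sysstat package and the device node are assumptions on my part; htop works interactively as well):

# start the direct read in the background, then sample per-core CPU load while it runs
sudo dd if=/dev/sda1 of=/dev/null bs=4M count=256 iflag=direct &
mpstat -P ALL 1 5   # from the sysstat package; watch whether one core pegs near 100%
wait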

It is quite possible that the PC is much faster, but finding the reason isn’t as simple as saying that it uses USB and thus must be a USB issue.

I’ll also suggest a verbose lsusb comparison between that from the PC and that from the Jetson. USB has different modes and options. For such a comparison to be practical you would need to limit the lsusb to one device (there’d be too much output if you have a fully verbose lsusb with multiple devices). If you run an ordinary “lsusb”, then you will see an “ID”. An example might show up as something like “0955:7023”, which is a combination of manufacturer ID and product ID within that manufacturer. I don’t know the ID of your disk, but if you run “lsusb”, then you should be able to ID it, and this will be the same on both PC and Jetson (this ID is embedded in the USB device). One could pair an “lsusb -tv” and a fully verbose lsusb (adjust for your ID) on PC and Jetson:

# On PC:
lsusb -tv
sudo lsusb -vvv -d 0955:7023

# Again on Jetson:
lsusb -tv
sudo lsusb -vvv -d 0955:7023

This would allow looking specifically to see if the disk has some difference in setup. You might also find the block size used by the kernel (which can differ from the device block size). Assuming this is “/dev/sda1”:
sudo blockdev --getbsz /dev/sda1
(if both systems use the same block size, then you can assume that this is not part of the reason)

You can also check if this differs between PC and Jetson:
sudo hdparm /dev/sda1
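
hdparm also has a built-in sequential read timing test, which gives a rough number to compare between the PC and the Jetson (run it a few times, as results vary):
sudo hdparm -tT /dev/sda1
(-T times cached reads, -t times buffered device reads)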

None of this says there is not a problem with the Jetson, but if there is a problem, then this might narrow down looking for the problem (which could be USB or something unrelated to USB).
