Hi, I posted this two months ago but didn’t get any response. Any help would be appreciated.
I have the following setup with my Jetson AGX Xavier:
My Windows PC is connected to the Jetson AGX via USB (gadget port), and an SSD connected internally through the M.2 NVMe interface is mapped as the backing storage of the mass storage device.
I have been running some benchmarks with Winsat to see what throughput I can get, and I hit a recurring bug: the transfer stops for 20 seconds at random times and then continues as if nothing happened.
The command I run on the PC is the following (in an administrator Command Prompt):
winsat disk -seq -write -drive E: -seqsize 524288 -v
And this is the output:
Run [1] Type[0x02000001] Zone[0] - 197.169459 MB/s
Run [2] Type[0x02000001] Zone[0] - 22.025154 MB/s
Run [3] Type[0x02000001] Zone[0] - 197.241335 MB/s
Run [4] Type[0x02000001] Zone[0] - 195.392863 MB/s
Run [5] Type[0x02000001] Zone[0] - 197.208295 MB/s
Run [6] Type[0x02000001] Zone[0] - 197.227573 MB/s
Run [7] Type[0x02000001] Zone[0] - 197.226710 MB/s
Run [8] Type[0x02000001] Zone[0] - 22.028424 MB/s
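To put a number on the slow runs, here is a minimal Python sketch that parses the Winsat lines above, flags runs far below the median, and estimates how much data a run would have to write for a single 20 second stall to explain the drop. The function names and the 50% threshold are illustrative assumptions, and the size estimate assumes exactly one stall per slow run.

```python
import re
from statistics import median

# Matches lines such as: Run [2] Type[0x02000001] Zone[0] - 22.025154 MB/s
RUN_RE = re.compile(r"Run \[(\d+)\].*?-\s*([\d.]+)\s*MB/s")

def parse_runs(text):
    """Return a list of (run_number, throughput_mb_s) from Winsat verbose output."""
    return [(int(m.group(1)), float(m.group(2))) for m in RUN_RE.finditer(text)]

def slow_runs(runs, ratio=0.5):
    """Return runs whose throughput is below ratio * median (threshold is an assumption)."""
    typical = median(mbps for _, mbps in runs)
    return [(n, mbps) for n, mbps in runs if mbps < ratio * typical]

def implied_run_size_mb(fast_mbps, slow_mbps, stall_s=20.0):
    """Solve S/fast + stall = S/slow for S: the run size one stall would explain."""
    return stall_s / (1.0 / slow_mbps - 1.0 / fast_mbps)

OUTPUT = """\
Run [1] Type[0x02000001] Zone[0] - 197.169459 MB/s
Run [2] Type[0x02000001] Zone[0] - 22.025154 MB/s
Run [3] Type[0x02000001] Zone[0] - 197.241335 MB/s
Run [4] Type[0x02000001] Zone[0] - 195.392863 MB/s
Run [5] Type[0x02000001] Zone[0] - 197.208295 MB/s
Run [6] Type[0x02000001] Zone[0] - 197.227573 MB/s
Run [7] Type[0x02000001] Zone[0] - 197.226710 MB/s
Run [8] Type[0x02000001] Zone[0] - 22.028424 MB/s
"""

runs = parse_runs(OUTPUT)
typical = median(mbps for _, mbps in runs)
for number, mbps in slow_runs(runs):
    size = implied_run_size_mb(typical, mbps)
    print(f"run {number}: {mbps:.1f} MB/s, implied run size {size:.0f} MB")
```

With the numbers above, the implied size comes out at roughly 500 MB per run. If that matches the amount Winsat actually writes per run (not confirmed here), each slow run would correspond to exactly one 20 second stall.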
I tweaked the kernel a bit to debug this problem, adding prints on reads and writes. This is the dmesg output I get during the disconnection (when the transfer rate reported by Winsat drops):
[ +0.002897] my_do_write!
[ +0.002813] my_do_write!
[ +20.001554] android_work: send uevent USB_STATE=DISCONNECTED
[ +0.000191] tegra-xudc-new 3550000.xudc: ep 3 disabled
[ +0.000057] tegra-xudc-new 3550000.xudc: ep 2 disabled
[ +0.000224] configfs-gadget gadget: super-speed config #1.c
[ +0.000134] android_work: sent uevent USB_STATE=CONNECTED
[ +0.000070] android_work: sent uevent USB_STATE=CONFIGURED
[ +0.000038] tegra-xudc-new 3550000.xudc: ep 3 (type: 2, dir: in) enabled
[ +0.000018] tegra-xudc-new 3550000.xudc: ep 2 (type: 2, dir: out) enabled
[ +0.003475] my_do_write!
[ +0.001192] my_do_write!
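For longer captures, the stall can be located automatically. The following is a minimal sketch, assuming the log uses the delta-timestamp format shown above (`[ +seconds] message`); the function name and the 1 second threshold are illustrative choices, not part of any tool.

```python
import re

# Matches delta-timestamped lines such as: [ +20.001554] android_work: ...
DELTA_RE = re.compile(r"^\[\s*\+\s*([\d.]+)\]\s*(.*)$")

def find_gaps(lines, threshold_s=1.0):
    """Return (delta_s, previous_message, message) for each line that
    arrived more than threshold_s after the line before it."""
    gaps = []
    previous = None
    for line in lines:
        m = DELTA_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not in the expected format
        delta, message = float(m.group(1)), m.group(2)
        if delta > threshold_s:
            gaps.append((delta, previous, message))
        previous = message
    return gaps

LOG = """\
[ +0.002897] my_do_write!
[ +0.002813] my_do_write!
[ +20.001554] android_work: send uevent USB_STATE=DISCONNECTED
[ +0.000191] tegra-xudc-new 3550000.xudc: ep 3 disabled
"""

for delta, before, after in find_gaps(LOG.splitlines()):
    print(f"{delta:.3f}s of silence between '{before}' and '{after}'")
```

On the excerpt above this reports a single gap, between the last my_do_write! print and the DISCONNECTED uevent.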
I then opened Wireshark on the USB interface. What I see is that a write request goes out at some point, and from then on, for the next 20 seconds, all I see are “URB_INTERRUPT in” messages with frame sizes of 32 and 27 going in and out of the device about 100 times a second. This goes on for roughly 16 seconds around the time the connection drops. As far as I can tell the PC doesn’t drop the connection, so I suppose it is probably the Jetson.
Do you know what might cause this? Is there any way to fix it?
Thanks a bunch.
Some more info:
- The bug happens in do_write in drivers/usb/gadget/function/f_mass_storage.c
- It seems like sleep_thread is called, then there is a 20 sec sleep and then raise_exception is called.
- Then do_set_interface is called and the connection resets.
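One way to timestamp the reset from userspace on the Jetson, without extra kernel prints, is to poll the gadget controller state. This is only a sketch: it assumes the kernel exposes the standard `/sys/class/udc/<name>/state` attribute, which has not been checked on this L4T kernel, and the function names and polling interval are illustrative.

```python
import os
import time

def read_udc_states(base="/sys/class/udc"):
    """Return {udc_name: state_string} for every controller found under base.
    The sysfs location is an assumption about the running kernel."""
    states = {}
    if not os.path.isdir(base):
        return states
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name, "state")
        try:
            with open(path) as f:
                states[name] = f.read().strip()
        except OSError:
            continue  # entry has no readable state attribute
    return states

def watch(base="/sys/class/udc", interval_s=0.05, duration_s=60.0):
    """Print a monotonic timestamp whenever any controller state changes."""
    last = None
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        current = read_udc_states(base)
        if current != last:
            print(f"{time.monotonic():.3f} {current}")
            last = current
        time.sleep(interval_s)
```

Calling `watch()` while Winsat is running would show whether the state leaves "configured" only at the end of the 20 second sleep (matching the raise_exception and do_set_interface sequence above) or earlier.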
So I’m not entirely sure what happens, but my guess is that the acknowledgment the Jetson sends to tell the host it is ready to receive the next data block somehow doesn’t reach the host. Then, after 20 seconds, the host resets the connection because it has a packet that went unacknowledged for that long.
I’ve looked a little into the Tegra UDC controller driver and I’ve seen that the clock speed gets changed (scaled down) in some places. Maybe this has something to do with it?
I’m not sure how to proceed with this, so any help would be appreciated.
Thanks a bunch.