Tegra-xusb: SuperSpeed device reconnects continuously, "Set TR Deq Ptr cmd failed" on Orin NX R36.5

Hi NVIDIA team,

We are seeing a SuperSpeed USB device on our Jetson Orin NX reconnect continuously without any userspace action or physical-layer event. Over ~35 minutes the kernel logged 347 disconnect/reconnect events on the same physical port, accompanied by 24 tegra-xusb warnings. We’ve narrowed the trigger down to the host-side kernel stack: the peripheral device itself and the Type-C physical layer show no events during the reconnects. We’d like guidance on whether this signature is known and what the recommended mitigation is.

Environment

| Item | Value |
| --- | --- |
| Platform | NVIDIA Jetson Orin NX Engineering Reference Developer Kit |
| L4T Release | R36.5.0 (JetPack 6.2) |
| BSP Build | nvidia-l4t-core 36.5.0-20260115194252 / GCID 43688277 |
| Kernel | 5.15.185-tegra |
| USB host controller | tegra-xusb 3610000.usb |
| Peripheral | Custom USB 3.0 bulk gadget, VID:PID 1d6b:0104, connected on bus 2, port 2 via Type-C |


$ lsusb -t

/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=tegra-xusb/4p, 10000M

|__ Port 2: Dev 49, If 0, Class=Vendor Specific Class, Driver=, 5000M

Symptom

On usb 2-2, the device is disconnected and re-enumerated repeatedly, sometimes multiple times per second. Each iteration consumes a new device number:


[Wed Apr 22 19:33:33 2026] usb 2-2: USB disconnect, device number 28

[Wed Apr 22 19:33:33 2026] tegra-xusb 3610000.usb: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.

[Wed Apr 22 19:33:34 2026] usb 2-2: new SuperSpeed USB device number 29 using tegra-xusb

[Wed Apr 22 19:33:41 2026] usb 2-2: USB disconnect, device number 29

[Wed Apr 22 19:33:41 2026] usb 2-2: new SuperSpeed USB device number 30 using tegra-xusb

[Wed Apr 22 19:33:42 2026] usb 2-2: USB disconnect, device number 30

[Wed Apr 22 19:33:43 2026] usb 2-2: new SuperSpeed USB device number 31 using tegra-xusb

[Wed Apr 22 19:33:44 2026] usb 2-2: USB disconnect, device number 31

[Wed Apr 22 19:33:44 2026] tegra-xusb 3610000.usb: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.

[Wed Apr 22 19:33:44 2026] usb 2-2: new SuperSpeed USB device number 32 using tegra-xusb

Two warning signatures observed:

  1. tegra-xusb 3610000.usb: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state. (22 occurrences)

  2. tegra-xusb 3610000.usb: WARN Event TRB for slot 3 ep 1 with no TDs queued? (2 occurrences)

Over a 35-minute window:

  • usb 2-2 disconnect/reconnect events: 347

  • tegra-xusb WARN lines: 24

  • Device numbers allocated: 2 → 49+ (no cleanup)
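A minimal Python sketch of how such counts could be derived from a saved dmesg. The function name `summarize` and the regular expressions are assumptions based on the log lines quoted above, not an existing tool:

```python
import re
from collections import Counter

# Patterns modelled on the log excerpts quoted in this post (assumed formats).
DISCONNECT = re.compile(r"usb (\S+): USB disconnect, device number (\d+)")
NEW_DEVICE = re.compile(r"usb (\S+): new .+? USB device number (\d+)")
XUSB_WARN = re.compile(r"tegra-xusb \S+: WARN (.+)$")

SAMPLE = """\
[Wed Apr 22 19:33:33 2026] usb 2-2: USB disconnect, device number 28
[Wed Apr 22 19:33:33 2026] tegra-xusb 3610000.usb: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Wed Apr 22 19:33:34 2026] usb 2-2: new SuperSpeed USB device number 29 using tegra-xusb
[Wed Apr 22 19:33:41 2026] usb 2-2: USB disconnect, device number 29
[Wed Apr 22 19:33:41 2026] usb 2-2: new SuperSpeed USB device number 30 using tegra-xusb
[Wed Apr 22 19:33:42 2026] usb 2-2: USB disconnect, device number 30
[Wed Apr 22 19:33:43 2026] usb 2-2: new SuperSpeed USB device number 31 using tegra-xusb
[Wed Apr 22 19:33:44 2026] usb 2-2: USB disconnect, device number 31
[Wed Apr 22 19:33:44 2026] tegra-xusb 3610000.usb: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Wed Apr 22 19:33:44 2026] usb 2-2: new SuperSpeed USB device number 32 using tegra-xusb
"""


def summarize(lines, port="2-2"):
    """Count disconnects, enumerations and tegra-xusb warnings for one port."""
    counts = Counter()    # per-port disconnect / new_device totals
    warnings = Counter()  # tegra-xusb WARN text -> occurrences
    for line in lines:
        line = line.rstrip()
        m = DISCONNECT.search(line)
        if m and m.group(1) == port:
            counts["disconnect"] += 1
            continue
        m = NEW_DEVICE.search(line)
        if m and m.group(1) == port:
            counts["new_device"] += 1
            continue
        m = XUSB_WARN.search(line)
        if m:
            warnings[m.group(1)] += 1
    return counts, warnings


if __name__ == "__main__":
    counts, warnings = summarize(SAMPLE.splitlines())
    print(dict(counts))
    for text, n in warnings.most_common():
        print(n, text)
```

Running the same function over the full attached dmesg instead of `SAMPLE` should give per-port totals that can be compared against the figures listed above.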

Why the trigger appears to be on the host side, not the peripheral or cabling

We did a two-sided investigation. The observations below rule out the peripheral and the Type-C physical layer, which leaves the host-side kernel stack as the most likely origin.

1. Type-C CC controller (fusb301) is silent

On both the Jetson side and the peripheral side, the fusb301 Type-C controller only logs the initial attach sequence at boot, with no detach or re-attach events during the reconnect storm.

Jetson side fusb301 events — only at boot, then silent:


[Wed Apr 22 19:08:59 2026] fusb301 1-0025: fusb301_detach: type[0x00] chipstate[0x01]

[Wed Apr 22 19:08:59 2026] fusb301 1-0025: fusb_update_state: 1

[Wed Apr 22 19:09:00 2026] fusb301 1-0025: fusb_update_state: a

[Wed Apr 22 19:09:00 2026] fusb301 1-0025: fusb_update_state: b

[Wed Apr 22 19:09:00 2026] fusb301 1-0025: fusb_update_state: 7 <-- last event; stable from here

⇒ The physical Type-C CC layer is stable. The reconnects are not caused by cable/connector issues.

2. Peripheral gadget was never torn down

On the peripheral side (a separate Linux SoC running FunctionFS):

  • /sys/kernel/config/usb_gadget/*/UDC mtime has not changed since boot, so userspace never unbound the gadget.

  • The peripheral’s EP0 never observed a DISABLE event.

  • FunctionFS URBs remained in_flight=8 throughout the storm; no ESHUTDOWN was propagated to the gadget.

  • The peripheral’s current_speed reported super-speed consistently.

⇒ From the peripheral’s point of view, VBUS was continuous and the pullup was continuous. The disconnects are not caused by the peripheral dropping off.
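The peripheral-side checks above can be captured as a snapshot and compared before and after a storm. A minimal Python sketch, assuming the standard configfs gadget layout and the `/sys/class/udc/<name>/current_speed` attribute; the function name `gadget_snapshot` and its parameters are hypothetical:

```python
from pathlib import Path


def gadget_snapshot(configfs_root="/sys/kernel/config/usb_gadget",
                    udc_class="/sys/class/udc"):
    """Return {gadget_name: {"udc", "udc_mtime", "current_speed"}}."""
    snapshot = {}
    for udc_file in sorted(Path(configfs_root).glob("*/UDC")):
        # An empty UDC file means the gadget is currently unbound.
        udc_name = udc_file.read_text().strip()
        entry = {
            "udc": udc_name,
            "udc_mtime": udc_file.stat().st_mtime,
            "current_speed": None,
        }
        speed_file = Path(udc_class) / udc_name / "current_speed"
        if udc_name and speed_file.is_file():
            entry["current_speed"] = speed_file.read_text().strip()
        snapshot[udc_file.parent.name] = entry
    return snapshot
```

Taking one snapshot at boot and another during a storm, then comparing `udc_mtime` and `current_speed`, would show whether anything touched the UDC binding or the negotiated speed in between.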

3. No userspace action on the host

Our Rust USB daemon on the Jetson only calls claim_interface() after opening the device, then uses bulk endpoints. It does not issue set_configuration, set_interface, reset_device, or clear_halt in the normal path. The disconnect is initiated entirely by tegra-xusb in kernel space, not by any userspace syscall.

4. Throughput behavior and self-healing (observations)

Expected throughput on this setup would be on the order of 90 MB/s for bulk transfer. In practice we have never reached that figure: typical observed throughput is a few hundred KB/s, occasionally bursting to ~20 MB/s, and the reconnect storm keeps happening at any load level, not only during idle periods. So we cannot attribute the storm to link idleness.

We have also observed long (30+ minute) self-healing intervals where the storm spontaneously stops without any intervention, then resumes later. We have not yet found a reliable external trigger that correlates with entering or leaving these stable periods.

Attempted mitigations

We tried the usual userspace knobs with no effect:

  • /sys/bus/usb/devices/2-2/power/control set to on: no change.

  • /sys/bus/usb/devices/2-2/power/autosuspend_delay_ms = 2000 (default): not a factor.
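To record the state of these knobs alongside each test run, something like the following Python sketch could be used. It assumes the standard USB sysfs power attributes; `usb3_hardware_lpm_u1` and `usb3_hardware_lpm_u2` are included because LPM is one of the suspects, but they may not be present on every kernel, in which case `None` is returned. The function name `read_power_attrs` is hypothetical:

```python
from pathlib import Path

# Attribute names under <device>/power/ in sysfs; the LPM ones are optional.
POWER_ATTRS = (
    "control",
    "autosuspend_delay_ms",
    "usb3_hardware_lpm_u1",
    "usb3_hardware_lpm_u2",
)


def read_power_attrs(device="2-2", sysfs_root="/sys/bus/usb/devices"):
    """Return {attribute: value or None} for one USB device's power directory."""
    power_dir = Path(sysfs_root) / device / "power"
    values = {}
    for name in POWER_ATTRS:
        try:
            values[name] = (power_dir / name).read_text().strip()
        except OSError:
            # Missing attribute, or the kernel refused the read.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_power_attrs().items():
        print(f"{name} = {value}")
```

Because the device re-enumerates constantly, the values are best read right after a `new SuperSpeed USB device` line, while the 2-2 directory exists.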

Questions

  1. Known issue? Is `WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state` on `tegra-xusb 3610000.usb` a known issue on JetPack 6.2 / L4T R36.5 for Orin NX? Is there an existing bug ID or a fix in a newer BSP?

  2. Root cause direction. Given that the peripheral and CC layer are quiescent while the host kernel keeps reporting incorrect slot or ep state on ep 1 of slot 3, what is the most likely internal state-machine path to investigate first? E.g. Set TR Dequeue Pointer command timing vs. endpoint state transitions, LPM U1/U2 handling, or something else specific to Orin NX?

  3. Tuning knobs to try now. Are there Orin NX / tegra-xusb specific knobs we can change without a kernel rebuild, such as

  • kernel cmdline parameters in /boot/extlinux/extlinux.conf,

  • device-tree property overrides,

  • or xhci_hcd / tegra-xusb module parameters,

that you would recommend we test first to narrow this down?

Related (but distinct) prior observation

On the same platform and kernel, earlier in April 2026 we saw a different failure signature with an ERROR-level message:


xhci-tegra 3610000.xhci: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 6 comp_code 4

xhci-tegra 3610000.xhci: Looking for event-dma ... trb-start ... trb-end ...

(trb-end monotonically increasing, never catching up to event-dma)

That mode bricked the UDC state machine on the peripheral side (functionfs returning EAGAIN, only a full reboot recovered). The current issue is clearly different: it produces a WARN Set TR Deq Ptr message rather than a TRB-DMA-mismatch ERROR, and the peripheral stays fully operational (URBs stay in flight, functionfs never returns EAGAIN, and a daemon restart, not a reboot, is enough to clear it).

We mention it because both failure modes appeared while the peripheral was actively streaming bulk data up to the Jetson (device → host), which is the primary traffic pattern of our recording workload. So the two may share a common origin in the Orin NX xhci endpoint/slot state-machine handling of that direction.

Reproducibility and follow-up

We can reproduce this on demand with the same kernel/BSP/peripheral. Happy to provide any of the following if helpful:

  • Longer dmesg (boot → 1+ hour runtime)

  • ftrace capture of xhci_hcd:* events during a reconnect burst

  • Test with a different SuperSpeed peripheral (e.g., an off-the-shelf USB3 flash drive) to see whether the warning signature is peripheral-specific

  • Test on USB2 speed only (gadget advertising only FS/HS descriptors)
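A minimal sketch of how the ftrace capture offered above could be scripted. It assumes tracefs is mounted at /sys/kernel/tracing (some setups expose it at /sys/kernel/debug/tracing), that the script runs as root, and that the xhci trace events appear under the event group name `xhci-hcd` as in the upstream kernel. The group name should be checked under `events/` on the target first, and the function name `capture_xhci_trace` is hypothetical:

```python
import time
from pathlib import Path


def capture_xhci_trace(seconds, tracefs="/sys/kernel/tracing",
                       group="xhci-hcd"):
    """Enable one trace event group for `seconds`, then return the trace text."""
    root = Path(tracefs)
    enable = root / "events" / group / "enable"
    trace = root / "trace"

    trace.write_text("")                      # truncating the file clears the buffer
    enable.write_text("1")                    # turn on every event in the group
    (root / "tracing_on").write_text("1")     # make sure the ring buffer is recording
    try:
        time.sleep(seconds)                   # wait for a reconnect burst
    finally:
        enable.write_text("0")                # always disable the events again
    return trace.read_text()
```

A call such as `capture_xhci_trace(30)` started just before a burst would return the buffered events as text, which can then be written to a file and attached next to the dmesg.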

Attachment: jetson_dmesg_runtime_only.txt, a trimmed dmesg containing platform info, xhci boot probe, and the full runtime event stream (347 usb 2-2 events + 24 tegra-xusb WARN lines over 35 minutes).

Thanks in advance for any pointers.

Hi,
Do you use the developer kit (Orin NX module + Orin Nano carrier board) and observe the issue there? If so, please share the reproduction steps for reference. We will see if we can replicate it on the developer kit.

Is this still an issue that needs support? Are there any results you can share?