Firmware updates, like the ones they rolled out when the Grace Hopper first launched with thermal throttling problems. Those improvements compounded through firmware releases over the first 4 months. Here’s hoping we’ll see some improvement over the next few months.
It is very stable so far, and I’m seeing massive improvements in model loading speeds with mmap off (like 5x or more!), but mmap performance is still mediocre.
It’s not officially released by NVIDIA yet, so if you want to try this new kernel, you have to compile it from source. Unless you are an experienced Linux user, I would not recommend doing this.
I’m by no means an expert in Linux, and while I’d like to think my Agents are, they sometimes are not :)
So I’ll wait till it’s stable enough for us to adopt. Thank you for sharing some of the efficiencies to expect!
My 40Gbps UGREEN arrived this morning and now it’s reading the right average read/write rates.
and accurately on my ports:
/: Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci-hcd/1p, 480M
/: Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci-hcd/1p, 20000M/x2
|__ Port 001: Dev 007, If 0, Class=Mass Storage, Driver=uas, 20000M/x2
Interestingly enough, although it’s the same brand and the same SSD, this time I’m not seeing the name spelled out under Disk or Device. @Neurfer, I wonder if it’s the JL chip that allowed the previous readout. Regardless, it was the wrong fit for my needs.
I am seeing the same behavior, even using the exact same recommendations for enclosure and disks:
When first connecting the USB enclosure to the Dell Pro Max with GB10, I get ~1.3 GB/s read, 1.8 GB/s write (using fio instead of the GUI).
I just came in today, and the enclosure is still connected to the system, and the drive with the filesystem remained mounted.
However, it switched to the USB 480 Mb/s bus, and gets about 40 MB/s now.
Rebooting the system didn’t help.
I had to unmount the filesystem, and disconnect/reconnect the enclosure (same orientation of cable).
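One way to catch this fallback without opening the GUI is to filter `lsusb -t` output for mass-storage interfaces that negotiated at USB 2.0 speed or lower. This is only a minimal sketch, assuming the `lsusb -t` line format shown earlier in this thread; the function name `slow_storage_links` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsusb -t` text on stdin and print every
# mass-storage interface whose negotiated speed is USB 2.0 or lower
# (the line ends in 480M, 12M or 1.5M).
slow_storage_links() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep 'Class=Mass Storage' | grep -E ', (480|12|1\.5)M$' || true
}

# Typical use on a live system (not run here):
#   lsusb -t | slow_storage_links
```

If this prints anything for the enclosure, the link has dropped to the slow bus and a replug is probably needed before trusting any benchmark numbers.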
I am wondering if I would have better luck going through a CalDigit TB4 plus (which supports USB-C).
Given that the GB10 only has 3 usable USB ports, a TB hub would be the best way to get additional ports.
Disk: 4 TB WD Black SN850X
Enclosure:
Using a standard TB4 dock doesn’t provide the true USB 3.2 Gen 2x2 speeds (20 Gbps), just the USB 3.2 Gen 2 speeds (10 Gbps).
It’s too bad they didn’t put a single USB4 port on the back of the GB10; it would have solved a ton of problems.
It depends on which chip is used: a JHL 7440 based enclosure will give only 10 Gbps, while an ASM based enclosure can give 20 Gbps. There’s also no point in buying a more expensive 40 Gbps or 80 Gbps enclosure. In fact, they run even hotter. The heat doesn’t come only from the NVMe module; the controller chip generates more heat as the transfer rate gets faster. But even with a 20 or 10 Gbps enclosure, they end up throttling down to 40 Mbps whenever I try to transfer a 100+ GB model, or even after just sitting idle for 30+ minutes. So at this time, external NVMe enclosures are for backup only.
Please share if any of you have found an NVMe enclosure + module combo with consistent 20 Gbps (R/W 2 GB/s) or even 10 Gbps (R/W 1 GB/s) without an external fan that sounds like a jet engine.
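Since link speeds in this thread are quoted in bits per second and benchmark results in bytes per second, dividing by 8 gives a rough ceiling for comparison. A small sketch follows; the function name is illustrative, and the result ignores line-encoding and protocol overhead, so real throughput will always be lower.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a link speed in Mbps to an upper bound in MB/s.
link_ceiling_MBps() {
    local mbps="$1"
    echo $(( mbps / 8 ))
}

link_ceiling_MBps 480     # prints 60
link_ceiling_MBps 10000   # prints 1250
link_ceiling_MBps 20000   # prints 2500
```

So a 20 Gbps link tops out around 2.5 GB/s before overhead, which fits the roughly 2 GB/s figure above, and a USB 2.0 link at 480 Mbps cannot exceed 60 MB/s.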
I have been experimenting with various fixes:
Using external TB docks (tried CalDigit and Dell)
Added a Linux quirk to disable UAS for the device.
I have not found a true solution yet, but I am fairly certain the cause is Linux, not a thermal issue.
I may throw Windows for Arm on the system to see if the issue reproduces.
Adding the Linux quirk seems to have resolved the issue:
read_throughput: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=64
….
….
Run status group 0 (all jobs):
READ: bw=1255MiB/s (1316MB/s), 1255MiB/s-1255MiB/s (1316MB/s-1316MB/s), io=24.8GiB (26.6GB), run=20214-20214msec
Disk stats (read/write):
sda: ios=27320/2, sectors=55951360/24, merge=0/1, ticks=40534/2, in_queue=40535, util=80.15%
———————————-
write_throughput: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=64
…
fio-3.36
……
WRITE: bw=1461MiB/s (1531MB/s), 1461MiB/s-1461MiB/s (1531MB/s-1531MB/s), io=28.8GiB (30.9GB), run=20205-20205msec
Disk stats (read/write):
sda: ios=0/31950, sectors=0/65417624, merge=0/298, ticks=0/40642, in_queue=40641, util=82.22%
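For anyone who wants to run a comparable test: the job headers above show sequential read and write with 1 MiB blocks, `libaio`, and a queue depth of 64, and each run lasted about 20 seconds. The sketch below only prints a fio command line with those settings so it can be inspected before running. The function name, the `--direct=1` and `--size=10G` choices, and the `/mnt/usb-test` directory are my assumptions, not the exact command used for the results above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) a fio command line.
#   $1 = "read" or "write"
#   $2 = directory on the drive under test (placeholder path below)
fio_throughput_cmd() {
    local mode="$1" dir="$2"
    printf 'fio --name=%s_throughput --directory=%s ' "$mode" "$dir"
    printf -- '--rw=%s --bs=1M --ioengine=libaio --iodepth=64 ' "$mode"
    printf -- '--direct=1 --size=10G --time_based --runtime=20s --group_reporting\n'
}

fio_throughput_cmd read /mnt/usb-test
```

Printing the command first makes it easy to double-check the target directory, since the write job will create a large test file there.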
$ lsusb -tv|m
/: Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci-hcd/1p, 480M
ID 1d6b:0002 Linux Foundation 2.0 root hub
/: Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci-hcd/1p, 20000M/x2
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 001: Dev 003, If 0, Class=Mass Storage, Driver=usb-storage, 20000M/x2
ID 174c:2463 ASMedia Technology Inc.
cat /etc/modprobe.d/blacklist-uas.conf
options usb_storage quirks=174c:2463:u
update-initramfs ……
update-grub …
dmesg
[ +0.001988] usb 1-1: UAS is ignored for this device, using usb-storage instead
[ +0.000005] usb-storage 1-1:1.0: USB Mass Storage device detected
[ +0.000365] usb-storage 1-1:1.0: Quirks match for vid 174c pid 2463: 800000
[ +0.000051] scsi host0: usb-storage 1-1:1.0
#uptime
22:23:30 up 2 days, 5:28, 2 users, load average: 0.31, 0.31, 0.41
|__ Port 001: Dev 003, If 0, Class=Mass Storage, Driver=usb-storage, 20000M/x2
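To summarize the change: the `u` flag in the `usb-storage` quirks option tells the kernel not to bind the `uas` driver to that vendor:product pair, which is why `lsusb` now reports `Driver=usb-storage` and dmesg says UAS is ignored. Below is a minimal sketch of generating the config line and checking which driver is bound. The function names are illustrative, and the IDs are the ones from the `lsusb` output above; other enclosures will have different IDs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the modprobe option line that makes
# usb-storage ignore UAS ("u" flag) for one vendor:product pair.
make_uas_quirk() {
    local vid="$1" pid="$2"
    printf 'options usb_storage quirks=%s:%s:u\n' "$vid" "$pid"
}

# Hypothetical helper: read `lsusb -t` text on stdin and print the driver
# bound to each mass-storage interface (expect "usb-storage", not "uas").
storage_drivers() {
    sed -n 's/.*Class=Mass Storage, Driver=\([^,]*\),.*/\1/p'
}

make_uas_quirk 174c 2463
# On a live system (not run here):
#   lsusb -t | storage_drivers
```

The printed line is what goes into the file under `/etc/modprobe.d/`; the initramfs still has to be rebuilt afterwards, as in the post above.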
The hardware peripheral configuration is simple:
UGREEN enclosure with WD SN850X (4 TB) directly attached to the Dell GB10 rear USB port.
I am using a Thunderbolt 3 cable, just because the quality is higher than standard USB-C.
I will monitor the output of this for a while:
For those who appreciate a graphical depiction:
```
lsusb -ttv|grep -B3 ASMedia
/:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci-hcd/1p, 20000M/x2
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    |__ Port 001: Dev 003, If 0, Class=Mass Storage, Driver=usb-storage, 20000M/x2
        ID 174c:2463 ASMedia Technology Inc.
```
I am about to test this newer enclosure (no fan).
I will test without UAS to determine once and for all whether that is the root cause, rather than thermal throttling.
20Gbps Lightning-Fast Transfers: This M2 enclosure delivers up to 20Gbps transfer speeds over USB 3.2 Gen 2x2 (20Gbps) connections
mdadm RAID over USB on DGX Spark: after reboot disks fall back to USB2 (480 Mbps) → defensive workaround + open question
I wanted to document a recurring issue after reboot with NVMe drives in USB enclosures on DGX Spark, and the practical workaround I ended up implementing to avoid data corruption — in case it helps others, or someone has a cleaner solution.
🧩 Observed problem
This system uses an mdadm RAID array (/dev/md0) mounted at /mnt/raid-modelos, built from two NVMe drives in USB enclosures.
- On cold boot or normal hot-plug:
  - Devices negotiate correctly at USB 3.x (≥ 5000 Mbps)
  - RAID assembles and mounts without issues
- After some reboots, intermittently:
  - The same devices enumerate as USB2 (480 Mbps)
  - Performance collapses
  - Assembling/mounting the RAID in this state is unsafe (timeouts, resets, corruption risk)

This looks like a USB enumeration / power / timing issue during boot, not thermal throttling and not an mdadm problem per se.
🛡️ Implemented solution (defensive)
I decided to never mount the RAID unless both disks negotiate at least USB3 speed.
High-level behavior:
- The RAID is only assembled and mounted if all member devices report ≥ 5000 Mbps
- If, after reboot, devices show up as USB2:
  - ❌ RAID is not assembled
  - ❌ Nothing is mounted
  - ✅ System remains in a safe, non-corrupting state

Everything is automated via udev + systemd.
🔧 How it works (summary)
- udev detects the USB devices by stable by-id / serial
- udev triggers a systemd service
- A small control script:
  - checks actual link speed via /sys/.../speed
  - if speed ≥ threshold:
    - mdadm --assemble
    - mount /mnt/raid-modelos
  - otherwise:
    - leaves the array stopped and unmounted

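For readers who want to build something similar, the gate described above can be sketched roughly as follows. This is not the actual `raid-modelos` script, just an illustration of the idea: the function names are made up, the sysfs path in the comment is an example, and `mount /mnt/raid-modelos` assumes a matching `noauto` entry in fstab.

```shell
#!/usr/bin/env bash
# Sketch of the speed gate. Names and defaults are illustrative.
MIN_USB_SPEED_MBPS="${MIN_USB_SPEED_MBPS:-5000}"

# Return 0 if the sysfs "speed" file $1 reports at least $2 Mbps.
speed_file_ok() {
    local file="$1" min="$2" speed=""
    [ -r "$file" ] || return 1
    read -r speed < "$file" || [ -n "$speed" ] || return 1
    speed="${speed%%.*}"            # "1.5" (low speed) becomes "1"
    [ "${speed:-0}" -ge "$min" ]
}

# Return 0 only if every given speed file passes the threshold.
all_links_ok() {
    local min="$1" f
    shift
    [ "$#" -gt 0 ] || return 1      # no devices given: treat as unsafe
    for f in "$@"; do
        speed_file_ok "$f" "$min" || return 1
    done
}

# Assemble and mount only when all member links are fast enough.
# Arguments: sysfs speed files of the member enclosures, for example
# /sys/bus/usb/devices/2-1/speed (example path, differs per system).
ensure_raid() {
    if all_links_ok "$MIN_USB_SPEED_MBPS" "$@"; then
        mdadm --assemble --scan && mount /mnt/raid-modelos
    else
        echo "USB link below ${MIN_USB_SPEED_MBPS} Mbps; leaving array stopped" >&2
        return 1
    fi
}
```

Treating "no devices found" as a failure is a deliberate choice here: if enumeration is late, the safe outcome is to leave the array alone.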
🛠️ Daily commands
sudo raid-modelos status # RAID + USB speed + mount status
sudo raid-modelos ensure # safe mount (USB >= threshold)
sudo raid-modelos start # forced mount (not recommended)
sudo raid-modelos stop && sync
Quick diagnostics:
lsusb -t
cat /proc/mdstat
mdadm --detail /dev/md0
findmnt /mnt/raid-modelos
📁 Key files involved
- Main control script: /usr/local/sbin/raid-modelos
- Config (USB speed threshold, by-id, waits): /etc/raid-modelos.conf
- udev rule: /etc/udev/rules.d/99-raid-modelos.rules
- systemd service: /etc/systemd/system/raid-modelos-ensure.service
- mdadm config: /etc/mdadm/mdadm.conf
- fstab entry with noauto
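As a rough idea of how the udev rule and the systemd service might fit together, here is a sketch that prints example contents for both files. These are not the actual files from this setup: the matching keys, the unit description, and the `EXAMPLE_SERIAL` placeholder are assumptions, and only the service name and script path are taken from the list above.

```shell
#!/usr/bin/env bash
# Print a udev rule that asks systemd to start the ensure service when a
# disk with the given serial appears. $1 = serial of one enclosure
# (placeholder value; check `udevadm info` for the real one).
print_udev_rule() {
    local serial="$1"
    printf 'ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", '
    printf 'ENV{ID_SERIAL_SHORT}=="%s", TAG+="systemd", ' "$serial"
    printf 'ENV{SYSTEMD_WANTS}+="raid-modelos-ensure.service"\n'
}

# Print a oneshot unit that calls the control script's "ensure" command.
print_service_unit() {
    cat <<'EOF'
[Unit]
Description=Assemble and mount the USB RAID only at USB3 speed

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/raid-modelos ensure
EOF
}

print_udev_rule EXAMPLE_SERIAL
print_service_unit
```

Using `SYSTEMD_WANTS` rather than running the script directly from the rule keeps the long-running work out of udev, which expects its handlers to finish quickly.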
⚙️ Useful tuning knobs
MIN_USB_SPEED_MBPS=5000 # default
MIN_USB_SPEED_MBPS=0 # allow mount even at 480 Mbps (NOT recommended)
WAIT_PARTITIONS_SEC=...
❓ Open question
Has anyone found a better or more root-cause solution to prevent USB-C / USB4 / Thunderbolt devices from falling back to USB2 (480 Mbps) after reboot?
I’d be especially interested in experience with:
- kernel / xHCI parameters
- power-management quirks
- USB enclosure firmware differences
- USB4 vs TB3/TB4 behavior
- cleaner ways to delay USB enumeration during boot

The current approach is safe and works well, but it’s clearly a defensive workaround rather than a true fix.
Any insight or alternative approach would be very welcome.


