Working configuration: RTX 5080 + Razer Core X V2 (Thunderbolt 5) on Ubuntu 24.04 / kernel 6.17 / driver 590.48.01-open

Sharing a working configuration for an RTX 5080 in a Thunderbolt 5 eGPU enclosure on Linux, since this hardware class is still rough as of driver 590.48.01. This is a workaround recipe that, on my setup, reliably gets the driver loaded and the GPU idling stably.

Hardware

  • Laptop: Dell Latitude 5540 (Raptor Lake-P)

  • Thunderbolt host: Intel Raptor Lake-P Thunderbolt 4 NHI

  • eGPU enclosure: Razer Core X V2 (USB4 / TB5)

  • GPU: NVIDIA GeForce RTX 5080 (Gigabyte Aorus, PCI ID 10de:2c02)

  • Link: USB4 at 40 Gb/s, 2 lanes × 20 Gb/s (per boltctl list)

Software

  • Ubuntu 24.04 LTS

  • Kernel 6.17.0-20-generic

  • nvidia-driver-590-open (590.48.01)

  • Secure Boot enabled

Symptoms without the configuration below

Every probe attempt failed with:

NVRM: The NVIDIA GPU 0000:03:00.0
NVRM: (PCI ID: 10de:2c02) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
nvidia 0000:03:00.0: probe with driver nvidia failed with error -1

The driver retried the probe roughly every 180 ms. lspci showed the GPU as healthy (rev a1, BARs assigned, link trained at 16 GT/s x4, which is expected over a Thunderbolt tunnel), so the GPU was enumerated but inaccessible to the driver at probe time.
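To confirm you are in the same failure mode, check the trained link state before changing anything. A minimal sketch, where the sample string stands in for live output of `sudo lspci -vv -s 03:00.0 | grep LnkSta` (substitute your own BDF):

```shell
# Stand-in for: sudo lspci -vv -s 03:00.0 | grep LnkSta
LNKSTA="LnkSta: Speed 16GT/s, Width x4"

# A trained 16GT/s x4 link combined with NVRM "fallen off the bus" messages
# means enumeration succeeded but the driver cannot reach the device at probe.
if echo "$LNKSTA" | grep -q "Speed 16GT/s"; then
  echo "link trained at 16GT/s x4 (expected over a TB tunnel)"
fi
```

If the link is instead downgraded or absent, the problem is upstream of the driver (cable, enclosure, or tunnel) rather than the probe-time issue described here.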

Working configuration

1. BIOS (Dell, F12 → BIOS Setup → Security tab):

Enable both:

  • Thunderbolt Boot Support

  • Thunderbolt (and PCIe behind TBT) pre-boot modules

The second is critical — it executes PCIe Option ROMs during pre-boot so the BIOS enumerates the GPU at POST and allocates PCIe resources cleanly. Without this, no kernel configuration resolves the problem reliably.

2. Open kernel modules (required for Blackwell):

sudo apt install nvidia-driver-590-open

The proprietary (closed) variant installs but fails at load time with “modprobe: ERROR: could not insert 'nvidia': No such device”, because it doesn’t recognize 10de:2c02.
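A quick way to confirm which variant is actually installed is the module license string: to my knowledge the open kernel modules report a dual MIT/GPL license, while the proprietary build reports "NVIDIA". A sketch, with the sample string standing in for live `modinfo nvidia -F license` output:

```shell
# Stand-in for: modinfo nvidia -F license
LICENSE="Dual MIT/GPL"

case "$LICENSE" in
  "Dual MIT/GPL") echo "open kernel modules installed" ;;
  *)              echo "proprietary module: will not bind 10de:2c02" ;;
esac
```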

Enroll the DKMS key for Secure Boot:

sudo mokutil --import /var/lib/shim-signed/mok/MOK.der
sudo reboot
# enroll at blue MOK Manager screen

3. Kernel parameters:

One explicit note: issue #979 suggests pci=assign-busses,realloc. On this hardware, that configuration left the driver in the probe-retry loop. The HPE advisory a00151736en_us documents a related Ubuntu + NVIDIA interaction in which pci=realloc causes the kernel to remove BIOS-assigned BAR ranges without reassigning them, preventing driver communication. pci=realloc=off resolved the probe failure on this hardware.

/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc=off pcie_aspm=off pcie_ports=native pcie_port_pm=off thunderbolt.clx=0 thunderbolt.host_reset=0 iommu=pt"

Then sudo update-grub.

Parameters:

  • pci=realloc=off — preserves BIOS BAR allocation (HPE advisory)

  • pcie_aspm=off — no ASPM L0s/L1 on TB-tunneled link

  • pcie_ports=native — kernel-native PCIe port management for D-state handling

  • pcie_port_pm=off — no D3cold entry on upstream bridges

  • thunderbolt.clx=0 — no TB CLx link power states

  • thunderbolt.host_reset=0 — no TB host controller reset cascade

  • iommu=pt — IOMMU passthrough
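To verify the parameters actually took effect after update-grub and a reboot, a small check against /proc/cmdline helps. The sketch below inlines the target command line so it runs anywhere; on the live machine, swap in CMDLINE="$(cat /proc/cmdline)":

```shell
# Target command line; on the live machine use: CMDLINE="$(cat /proc/cmdline)"
CMDLINE="quiet splash pci=realloc=off pcie_aspm=off pcie_ports=native pcie_port_pm=off thunderbolt.clx=0 thunderbolt.host_reset=0 iommu=pt"

missing=0
for p in pci=realloc=off pcie_aspm=off pcie_ports=native \
         pcie_port_pm=off thunderbolt.clx=0 thunderbolt.host_reset=0 iommu=pt; do
  case " $CMDLINE " in
    *" $p "*) ;;                          # parameter present
    *) echo "missing: $p"; missing=1 ;;   # forgot update-grub or reboot?
  esac
done
[ "$missing" -eq 0 ] && echo "all eGPU parameters active"
```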

4. Module parameters:

/etc/modprobe.d/nvidia-egpu.conf:

options nvidia NVreg_DynamicPowerManagement=0x00
options nvidia NVreg_PreserveVideoMemoryAllocations=0

NVreg_DynamicPowerManagement=0x00 is particularly important — D3cold transition failures over Thunderbolt manifest as “fallen off the bus.” Disabling runtime PM prevents the GPU from attempting to enter D3cold.

Then sudo update-initramfs -u.
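After the reboot, the driver exposes the values it actually loaded with in /proc/driver/nvidia/params; on my reading, names appear there without the NVreg_ prefix and values print in decimal. A sketch, with a sample line standing in for the live file:

```shell
# Stand-in for: grep DynamicPowerManagement /proc/driver/nvidia/params
# (assumption: the file lists names without the NVreg_ prefix, in decimal)
PARAMS="DynamicPowerManagement: 0"

if echo "$PARAMS" | grep -q "DynamicPowerManagement: 0"; then
  echo "runtime PM disabled: no D3cold attempts"
fi
```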

5. Boot procedure:

Cold boot only. Fully shut down, ensure eGPU is powered on and connected, then power on the laptop. With pre-boot TB (step 1), the BIOS enumerates the GPU at POST.

Hot-plug via egpu-init style scripts remains unreliable in my testing — the rapid remove/rescan/modprobe sequence runs before the Thunderbolt tunnel fully stabilizes.

6. Persistence:

sudo systemctl enable --now nvidia-persistenced

The Ubuntu-packaged unit file lacks an [Install] section; if enable fails, create a drop-in with WantedBy=multi-user.target.
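If enable does fail for that reason, the drop-in is only a few lines; a sketch of the override (the path assumes the stock unit name nvidia-persistenced.service):

```
# /etc/systemd/system/nvidia-persistenced.service.d/10-install.conf
[Install]
WantedBy=multi-user.target
```

Then sudo systemctl daemon-reload followed by the enable command above.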

Result

NVIDIA-SMI 590.48.01   Driver Version: 590.48.01   CUDA Version: 13.1
RTX 5080 | 16303 MiB | P8 | 29°C | 10W / 360W

Reproducible across cold boots.

Credits

  • hvico/Razer-Core-v2-Linux-Fix — baseline configuration (nouveau blacklist, egpu-init script, systemd units)

  • HPE advisory — the pci=realloc=off finding

Stability testing

Tested with PyTorch 2.11.0 + cu130 build:

Test 1 (tiny allocation)              — pass
Test 2 (100×100 matmul)               — pass
Test 3 (1000×1000 matmul)             — pass
Test 4 (10000×10000 matmul, ~800MB)   — pass

Sustained load test: 4275 iterations of 8000×8000 float32 matmul over 120 seconds (~36 TFLOPs sustained, ~64% of RTX 5080 peak FP32). No hard-lock, no thermal throttling observed, iteration rate stable throughout.
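As a sanity check on that throughput figure: one N×N matmul costs about 2·N³ floating-point operations, so the sustained rate follows directly from the iteration count:

```shell
# Sanity-check the sustained-throughput figure: one NxN float32 matmul
# costs ~2*N^3 floating-point operations.
awk 'BEGIN {
  n = 8000; iters = 4275; secs = 120
  flops  = 2 * n^3 * iters        # total FLOPs over the run
  tflops = flops / secs / 1e12    # sustained rate
  printf "%.1f TFLOPs sustained\n", tflops   # prints 36.5 TFLOPs sustained
}'
```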