RTX 3090 eGPU over TB3 randomly falls off bus on Ubuntu 22.04.5 / Dell XPS 13 9340

georg9alem · February 16, 2026, 10:25pm

I’ve been using an eGPU setup for a couple of years now (RTX 3090 in a Razer Core X Chroma over Thunderbolt) and only recently started getting random GPU disconnects mid-workload.

The failure is intermittent: sometimes after ~30 minutes, sometimes only after ~20 hours. It can happen under low VRAM usage too (not only high load).

Specs:

Laptop: Dell XPS 13 9340
OS: Ubuntu 22.04.5 LTS
Kernel: 6.5.0-45-generic
eGPU enclosure: Razer Core X Chroma (TB3)
GPU: NVIDIA RTX 3090
Driver stack currently installed: 580.126.09 (nvidia-driver-580-open)

Recent changes I can recall from shortly before the issue started:

NVIDIA driver update (via unattended-upgrades):
- 580.95.05 → 580.126.09
Dell BIOS update:
- 1.21.0 → 1.23.0
I also cleaned laptop vents externally with compressed air (see below why I mention this).

Mid-workload, the GPU apparently falls off the bus and I get “no devices were found”. Kernel logs repeatedly show this sequence:

Many corrected AER Data Link errors (BadDLLP) on pcieport 0000:02:01.0 (Intel JHL6540 TB3 bridge)
Then fatal AER (DLP) and link reset/recovery failure
Then NVIDIA:
- Xid 79, GPU has fallen off the bus
- Xid 154, recovery action … Node Reboot Required
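To check whether a given log follows this escalation (correctable errors, then a fatal AER event, then Xid messages), something like the following could be run over saved kernel log lines. It is only a rough sketch: the sample lines are illustrative rather than copied from the attached bug reports, the GPU address `0000:04:00` is made up, and the exact wording of the severity and Xid lines can differ between kernel and driver versions.

```python
import re

# Assumed line shapes; adjust the patterns if your kernel/driver prints them differently.
SEVERITY_RE = re.compile(r"PCIe Bus Error: severity=([^,]+),")
XID_RE = re.compile(r"NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): (\d+),")


def timeline(lines):
    """Count correctable AER errors seen before the first fatal one
    and collect the Xid codes in the order they appear."""
    correctable_before_fatal = 0
    fatal_seen = False
    xids = []
    for line in lines:
        sev = SEVERITY_RE.search(line)
        if sev:
            severity = sev.group(1)
            if severity.startswith("Correctable") and not fatal_seen:
                correctable_before_fatal += 1
            elif "(Fatal)" in severity:
                fatal_seen = True
            continue
        xid = XID_RE.search(line)
        if xid:
            xids.append(int(xid.group(1)))
    return {
        "correctable_before_fatal": correctable_before_fatal,
        "fatal_seen": fatal_seen,
        "xids": xids,
    }


# Hypothetical sample lines (NOT taken from the attached logs).
sample = [
    "kernel: pcieport 0000:02:01.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)",
    "kernel: pcieport 0000:02:01.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)",
    "kernel: pcieport 0000:02:01.0: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Data Link Layer, (Receiver ID)",
    "kernel: NVRM: Xid (PCI:0000:04:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.",
    "kernel: NVRM: Xid (PCI:0000:04:00): 154, recovery action … Node Reboot Required",
]

print(timeline(sample))
```

Feeding it the output of `journalctl -k` for a boot where the drop happened should show how many correctable errors piled up before the link went down.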

I attached nvidia-bug-report logs (baseline and after drop). Could this be related to the physical TB connection?

Has anyone seen this exact TB/eGPU pattern on 580.126.09 (open driver) on Ubuntu 22.04, and is there a recommended driver branch/workaround to test first?

P.S.

I have to use the open drivers because I have a second eGPU setup with a 5090, and AFAIK the Blackwell architecture doesn’t work without them.

georg9alem · February 16, 2026, 10:27pm

nvidia-bug-report.baseline.log (5.2 MB)

nvidia-bug-report.after-drop.log (5.6 MB)

morgwai666 · February 16, 2026, 11:42pm

There is a pretty good chance that your TB cable is dying after a few years: have you tried another?

georg9alem · February 17, 2026, 7:36am

Thanks for the follow-up.

Yep, swapping the cable is my first test. If the problem persists, do you see another likely root cause (e.g., laptop TB controller/retimer path or ports)?
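One way to make a cable test measurable might be to compare the kernel's per-device AER counters before and after a workload, once with each cable. This is a minimal sketch, assuming the kernel exposes the `aer_dev_correctable` sysfs attribute for the bridge in question; the function names are made up for illustration, and `0000:02:01.0` is simply the bridge address from the first post.

```python
from pathlib import Path


def parse_aer_stats(text):
    """Turn 'Name Count' lines (as found in aer_dev_correctable) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_aer_correctable(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read the correctable-error counters for one PCI device.

    Assumes the attribute exists for this device; it will raise
    FileNotFoundError otherwise.
    """
    path = Path(sysfs_root) / bdf / "aer_dev_correctable"
    return parse_aer_stats(path.read_text())


def delta(before, after):
    """Return only the counters that changed between two snapshots."""
    changed = {}
    for name, value in after.items():
        diff = value - before.get(name, 0)
        if diff != 0:
            changed[name] = diff
    return changed
```

Usage would be along the lines of `before = read_aer_correctable("0000:02:01.0")`, running the workload for a fixed time, then `print(delta(before, read_aer_correctable("0000:02:01.0")))`. Keep in mind that the counters most likely start from zero again after a reboot or a re-plug of the enclosure, so both snapshots need to come from the same session.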

morgwai666 · February 17, 2026, 1:23pm

If it’s not the cable, then it could be literally any other component, each with roughly similar probability, so it’s very hard to tell.

It’s probably not a software issue, because that would affect many people at roughly the same time, and there has been no sudden wave of failure reports on egpu.io etc. The only software component that may be involved, IMO, is the Dell firmware upgrade, since that could affect only your model, for example, so you could check whether downgrading helps.

alexanderbrun · March 30, 2026, 10:11pm

Hey @georg9alem,

were you able to find the root cause of your issue?

I am using a Razer Core X with a 5060 Ti and Ubuntu 24.04 on a Minisforum UM790 Pro and I’m getting exactly the same AER errors in my journalctl output. I am also using the 580-open driver.

Sometimes the PC crashes. This can happen after 2 minutes, or sometimes only after hours.

I’d be very interested to hear what fixed it for you, or whether the Razer Core X was simply defective.

Would be very happy for a reply :)

Thanks in advance!

catt · March 31, 2026, 1:43am

You may want to check out the release notes for the latest driver, 595.45.04: https://www.nvidia.com/en-us/drivers/details/265870/
It sounds like the issue may be fixed there?

If your issue still happens on 595.45.04, you’ll want to attach a bug report file for NVIDIA to review.

alexanderbrun · March 31, 2026, 10:24am

Hey @catt, and thanks for your fast reply.
I’ve installed the latest driver, 595.58.03, from the NVIDIA repos, but the error still occurs.

So the AER BadDLLP issue persists.

Thanks for your help anyway!

Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:00:04.1: AER: Correctable error message received from 0000:65:01.0

Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:65:01.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)

Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:65:01.0: device [8086:15da] error status/mask=00000080/00002000

Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:65:01.0: [ 7] BadDLLP
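For anyone who wants to tally these messages rather than read them one by one, here is a rough sketch that decodes the `error status/mask` value and counts correctable errors per device. The bit names follow the short names the Linux kernel prints for the AER Correctable Error Status register; the regular expressions are my own guess at the line format based on the lines above, so treat them as assumptions.

```python
import re
from collections import Counter

# Correctable Error Status bits, with the kernel's short names.
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}

SEVERITY_RE = re.compile(
    r"pcieport (?P<bdf>[0-9a-f:.]+): PCIe Bus Error: severity=(?P<sev>[^,]+),"
)
STATUS_RE = re.compile(
    r"pcieport (?P<bdf>[0-9a-f:.]+): device \[(?P<id>[0-9a-f]{4}:[0-9a-f]{4})\] "
    r"error status/mask=(?P<status>[0-9a-f]{8})/(?P<mask>[0-9a-f]{8})"
)


def decode_bits(value):
    """List the correctable-error names whose bits are set in value."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items()) if value & (1 << bit)]


def summarize(lines):
    """Count correctable errors per (device address, vendor:device ID, error name)."""
    counts = Counter()
    last_severity = {}
    for line in lines:
        sev = SEVERITY_RE.search(line)
        if sev:
            last_severity[sev.group("bdf")] = sev.group("sev")
            continue
        st = STATUS_RE.search(line)
        # Only decode when the preceding severity line for this device said
        # "Correctable"; uncorrectable status words use a different bit layout.
        if st and last_severity.get(st.group("bdf"), "").startswith("Correctable"):
            for name in decode_bits(int(st.group("status"), 16)):
                counts[(st.group("bdf"), st.group("id"), name)] += 1
    return counts


# Two of the lines quoted above.
sample = [
    "Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:65:01.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)",
    "Mär 31 12:23:14 brun-ki01 kernel: pcieport 0000:65:01.0: device [8086:15da] error status/mask=00000080/00002000",
]

for key, n in summarize(sample).items():
    print(key, n)
```

With the sample above, status `00000080` has only bit 7 set, which matches the `[ 7] BadDLLP` line, and `decode_bits(0x00002000)` shows that the mask value corresponds to `NonFatalErr`.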

morgwai666 · April 1, 2026, 9:56am

Blackwell is known to be extremely unstable as a TB eGPU: see the egpu.io forum for dozens of similar problem reports. This is most probably due to its low tolerance for PCIe signal latency.
