Hey everyone, I’m having pretty severe issues with my RTX 3090 eGPU setup and am at my wits’ end.
My setup:
- Framework Laptop 13 w/ Ryzen 5 7640U (BIOS version 03.05)
- 96 GB DDR5 RAM
- Ubuntu 22.04 (Wayland) with the 6.5.0-1025-oem kernel
- Nvidia RTX 3090 in a Razer Core X eGPU enclosure
The issue:
My eGPU regularly crashes, forcing me to restart my laptop. Here’s what I’ve observed:
- The GPU mostly crashes under high load, for instance when running Stable Diffusion or an LLM. This can happen after as little as 2 minutes of use.
- Other times, the 3090 runs fine for 20+ minutes at almost 100% load without issues.
- Occasionally, the GPU crashes even when there’s barely any load.
- When the crash occurs, my whole system often becomes unresponsive as well.
What I’ve tried so far:
- Switched to the open-kernel Nvidia drivers
- Limited the clock speed of the 3090 with nvidia-smi -lgc 300,1750
- Blacklisted the Nouveau driver
- Added pcie_aspm=off to my kernel parameters
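In case it helps anyone double-check that these changes are actually in effect, here is a minimal shell sketch. The helper name has_kernel_param is made up for illustration, and the script assumes a standard /proc/cmdline, lsmod, and nvidia-smi on the PATH; each check is skipped if the tool is missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): succeeds if the kernel
# command line given as $1 contains the exact parameter given as $2.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel's command line for pcie_aspm=off.
if [ -r /proc/cmdline ]; then
    if has_kernel_param "$(cat /proc/cmdline)" "pcie_aspm=off"; then
        echo "pcie_aspm=off is active"
    else
        echo "pcie_aspm=off is NOT active"
    fi
fi

# Check that the nouveau module is not loaded (i.e. the blacklist worked).
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | grep -q '^nouveau'; then
        echo "nouveau is loaded"
    else
        echo "nouveau is not loaded"
    fi
fi

# Show current and maximum graphics clocks to compare against the -lgc limit.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=clocks.gr,clocks.max.gr --format=csv
fi
```

The kernel parameter check matches whole space-separated tokens, so a similar-looking parameter with extra characters would not count as a match.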
I’ve attached two log files:
- https://pastebin.com/miKexVUJ: Contains journalctl, dmesg and other logs collected while the eGPU was working properly.
- https://pastebin.com/c8i1wiyW: Contains the same logs collected immediately after the eGPU crash.
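For anyone skimming the logs, a small sketch for pulling out just the Nvidia driver error lines. It assumes the driver reports GPU errors as kernel messages containing "NVRM: Xid"; the function name xid_lines is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print only the
# lines that contain an Nvidia driver Xid error report.
# "|| true" keeps the exit status zero when no such line is present.
xid_lines() {
    grep 'NVRM: Xid' || true
}

# Typical usage on the affected machine (reading dmesg may need root):
#   sudo dmesg | xid_lines
#   journalctl -k -b -1 | xid_lines   # kernel log of the previous boot
```

Running the second form right after a forced restart should show whatever the driver logged just before the crash, provided the journal is persistent.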
I’m at my wits’ end and will have to sell the Razer Core X and the 3090 if I can’t fix this, but I’d much rather keep them, so I would be incredibly grateful if someone could help me :(