Loading GSP firmware from an AMD Strix laptop to a TB5 3090 eGPU causes instant reboot

I have a TUXEDO IBP-14 gen10 AMD laptop (HX370 APU, based on TongFang X4SP4NAL) and an RTX 3090 that I’ve been successfully connecting using various eGPU adapters:

…and it has all been perfectly stable on my Debian-13 system with the DC driver versions 575, 580 (from the D-12 repo) and 590 (from the D-13 repo), in both open and proprietary flavors, across several kernel versions (6.16.1 to 6.18.x).

Recently I purchased a new TB5 adapter, the Minisforum DEG2. It is quite new, but the few reviews it has received from Windows users praise it for perfect stability and flawless operation.
However, when I connected my 3090 to my laptop via the new DEG2, the laptop abruptly rebooted about 0.5-1 second after the driver was loaded. I tried all the driver versions mentioned above and even Nouveau, as well as different distros (live Fedora-43 + Nouveau, live POP_OS-24.04 + v580), but the result was always the same (I even wanted to try Windows-11, but it currently cannot be installed on this laptop).

After some hair-pulling and extensive testing, I noticed that the operation causing the reboots is somehow related to the GSP firmware. I’ve verified that the following configurations allow the eGPU to work mostly stably:

  • Nouveau driver without firmware-nvidia-graphics package (containing GSP v570.144)
  • proprietary driver v580 and v590 with options nvidia NVreg_EnableGpuFirmware=0 (the open flavor still reboots even with this param set to 0)
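
For anyone reproducing the second configuration, it can help to confirm that the option is actually present in a modprobe.d fragment before testing. Below is a minimal Python sketch under a few assumptions: the option line has exactly the form quoted above (`options nvidia NVreg_EnableGpuFirmware=0`), the fragment lives in `/etc/modprobe.d`, and the function names `gsp_firmware_setting` and `scan_modprobe_dir` are made up for illustration and are not part of any driver tooling.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the option exactly as it is written in the list above.
_GSP_OPTION = re.compile(r"NVreg_EnableGpuFirmware=(\d+)")


def gsp_firmware_setting(conf_text: str) -> Optional[int]:
    """Return the last NVreg_EnableGpuFirmware value set for module `nvidia`,
    or None if no `options nvidia ...` line sets it."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) < 3 or fields[0] != "options" or fields[1] != "nvidia":
            continue  # not an options line for the nvidia module
        for field in fields[2:]:
            match = _GSP_OPTION.fullmatch(field)
            if match:
                value = int(match.group(1))
    return value


def scan_modprobe_dir(directory: str = "/etc/modprobe.d") -> Optional[int]:
    """Apply gsp_firmware_setting to every *.conf file in `directory`
    (the default path is an assumption; other modprobe.d locations exist)."""
    value = None
    for path in sorted(Path(directory).glob("*.conf")):
        found = gsp_firmware_setting(path.read_text(errors="replace"))
        if found is not None:
            value = found
    return value
```

If `scan_modprobe_dir()` returns 0, some fragment asks for the GSP firmware to be disabled. It says nothing about whether the loaded driver honored the request, which is exactly where the open and proprietary flavors differ in the observations above.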

Here is a reference bug report file from a session where it works fine with proprietary v590 and no GSP firmware:
nvidia-bug-report-deg2-v590prop-noGsp.log.gz (1.1 MB)

It is impossible to get a bug report file with the GSP firmware present because of the almost immediate reboot. However, if I remount my filesystem in sync mode and continuously stream /dev/kmsg to a file there (using dmesg --follow), then sometimes (roughly every third attempt) the following amdgpu entries are caught a split second before the reboot:

[  203.722070] amdgpu 0000:65:00.0: amdgpu: Fence fallback timer expired on ring gfx_0.0.0
[  203.722081] amdgpu 0000:65:00.0: amdgpu: Fence fallback timer expired on ring sdma0
[  204.262063] amdgpu 0000:65:00.0: amdgpu: Fence fallback timer expired on ring comp_1.2.0
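
The capture setup can be approximated with a short script. This is a minimal Python sketch and not the exact procedure described above: instead of remounting the filesystem in sync mode, it calls fsync after every line, which has a similar goal (getting each kernel message onto the disk before the machine resets). The names `copy_lines_synced` and `follow_dmesg` and the output file name are hypothetical, and reading the kernel log may require root depending on the system's settings.

```python
import os
import subprocess
from typing import Iterable, Iterator


def copy_lines_synced(lines: Iterable[str], dst_path: str) -> int:
    """Append every line to dst_path, forcing it to disk before reading the next one.
    Returns the number of lines written."""
    count = 0
    with open(dst_path, "a", encoding="utf-8", errors="replace") as dst:
        for line in lines:
            dst.write(line if line.endswith("\n") else line + "\n")
            dst.flush()             # push Python's buffer to the kernel
            os.fsync(dst.fileno())  # ask the kernel to push it to the storage device
            count += 1
    return count


def follow_dmesg() -> Iterator[str]:
    """Yield kernel log lines as `dmesg --follow` prints them (runs until killed)."""
    proc = subprocess.Popen(
        ["dmesg", "--follow"],
        stdout=subprocess.PIPE,
        text=True,
        errors="replace",
    )
    assert proc.stdout is not None
    yield from proc.stdout


# Example invocation (blocks until interrupted or until the machine reboots):
# copy_lines_synced(follow_dmesg(), "kmsg-capture.log")
```

Even with per-line fsync, the last few messages can still be lost if the reset happens before the storage device completes the write, which would be consistent with the entries only being caught on some attempts.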

Here are the full /dev/kmsg contents from a “GSP boot”:
dmesg-deg2-x4sp4nal-debian13-6.18.5-v580prop-gsp.log (220.5 KB)

Not sure if amdgpu is to blame here or if it is just a victim of a general f*ck-up on the PCIe bus, but I will post it to their issue tracker just in case and will provide a link here later.
UPDATE: the report has now been filed on the amdgpu issue tracker.

Corresponding egpu.io thread: WIP: 2025 14″ TUXEDO InfinityBook Pro 14 Gen10 (890M) [RAI3,12C,HX] + RTX 3090 @ 64Gbps-USB4v1>TB5 (Minisforum DEG2) + Linux Debian trixie // loading NV GSP firmware reboots the laptop, occasional falling off the bus, daisy-chaining fails consistent… | Work-In-Progress Builds

I will also post to linux-usb in case it’s a problem in the Linux thunderbolt module.

kernel.org bugzilla entry: (CCed to linux-usb automatically)
https://bugzilla.kernel.org/show_bug.cgi?id=221319

The problem is being actively discussed on the Framework forum: Help wanted - eGPU bootloop issues - Framework Laptop 16 - Framework Community

I’ve also asked Tuxedo for assistance: reporting TB5 problems involving IBP-14 gen10 AMD to linux-usb (#186) · Issues · TUXEDO Computers / Development / Packages / linux · GitLab

…and Minisforum (who are in the best position to debug this, according to Mario Limonciello from AMD), but their reply is just sad:

Dear Customer,

Thank you for your patience.

After further discussion regarding your issue, we sincerely apologize that we are unable to conduct accurate debugging on Linux systems due to technical limitations. This problem is likely an isolated compatibility issue specific to Linux. We recommend installing Windows 11 for testing to verify if the random restart still occurs.

Best regards,