Hello!
I am having problems setting up an eGPU on Ubuntu 24.04.
My goal is to enable hot-plugging of the eGPU, but no matter what I do it always fails. At first I got a PCI error indicating that there were no free addresses. After tuning the configs, the eGPU is recognized by the OS and visible in lspci, but nvidia-smi still fails to detect the card. Since I do not know how to proceed, I am asking for help here :)
The things I have done so far:
Enabled ‘AllowExternalGpus’ option.
Tweaked kernel parameters to: quiet splash pci=realloc,assign-busses,hpbussize=0x10,hpmmiosize=32M,hpmmioprefsize=256M
Changed BIOS settings (disabled security and Secure Boot).
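To double-check that the pci= options listed above are really the ones the running kernel received, the command line can be split into its individual options. This is only a minimal sketch: the helper name pci_options is made up for illustration, and the demo runs on the parameter string from this post. On the actual machine the text would come from /proc/cmdline instead.

```python
def pci_options(cmdline):
    """Collect the comma-separated options of every pci= parameter into a dict."""
    opts = {}
    for token in cmdline.split():
        if token.startswith("pci="):
            for item in token[len("pci="):].split(","):
                key, _, value = item.partition("=")
                # Plain flags such as "realloc" carry no value and map to "".
                opts[key] = value
    return opts


if __name__ == "__main__":
    # Parameter string copied from the post. On the laptop itself, use
    # open("/proc/cmdline").read() to see what the kernel actually booted with.
    sample = ("quiet splash pci=realloc,assign-busses,hpbussize=0x10,"
              "hpmmiosize=32M,hpmmioprefsize=256M")
    active = pci_options(sample)
    for name in ("realloc", "assign-busses", "hpbussize",
                 "hpmmiosize", "hpmmioprefsize"):
        state = repr(active[name]) if name in active else "NOT ACTIVE"
        print(f"{name}: {state}")
```

If an option shows up as NOT ACTIVE when fed the real /proc/cmdline, the bootloader configuration was probably not regenerated after editing.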
Currently, when I plug in the eGPU, it is sometimes correctly recognized and addressed, and is visible in lspci:
sashamikoff@sashamikoff-ThinkPad-T480s:~$ lspci | grep -i nvi
01:00.0 3D controller: NVIDIA Corporation GP108M [GeForce MX150] (rev a1)
0a:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2070] (rev a1)
0a:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
0a:00.2 USB controller: NVIDIA Corporation TU106 USB 3.1 Host Controller (rev a1)
0a:00.3 Serial bus controller: NVIDIA Corporation TU106 USB Type-C UCSI Controller (rev a1)
but it is never recognized by nvidia-smi.
If I restart the machine with the eGPU plugged in, it is visible to nvidia-smi and I can use it.
[ 375.608912] pci 0000:0a:00.0: BAR 1 [mem size 0x10000000 64bit pref]: can't assign; no space
[ 375.608915] pci 0000:0a:00.0: BAR 1 [mem size 0x10000000 64bit pref]: failed to assign
[ 375.608917] pci 0000:0a:00.0: BAR 3 [mem size 0x02000000 64bit pref]: can't assign; no space
[ 375.608919] pci 0000:0a:00.0: BAR 3 [mem size 0x02000000 64bit pref]: failed to assign
[ 375.608922] pci 0000:0a:00.0: BAR 0 [mem 0xc5000000-0xc5ffffff]: assigned
[ 375.608933] pci 0000:0a:00.0: ROM [mem 0xc4800000-0xc487ffff pref]: assigned
[ 375.608936] pci 0000:0a:00.2: BAR 0 [mem 0xc4880000-0xc48bffff 64bit pref]: assigned
[ 375.608969] pci 0000:0a:00.2: BAR 3 [mem 0xc48c0000-0xc48cffff 64bit pref]: assigned
[ 375.609002] pci 0000:0a:00.1: BAR 0 [mem 0xc48d0000-0xc48d3fff]: assigned
[ 375.609014] pci 0000:0a:00.3: BAR 0 [mem 0xc48d4000-0xc48d4fff]: assigned
[ 375.609031] pci 0000:0a:00.0: BAR 5 [io size 0x0080]: can't assign; no space
[ 375.609034] pci 0000:0a:00.0: BAR 5 [io size 0x0080]: failed to assign
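To see how much address space the failed assignments add up to, the "can't assign" lines can be totalled per resource type. A rough sketch, assuming the dmesg format shown above; the function names failed_bars and summarize are made up for illustration, and the sample lines are copied from the log in this post.

```python
import re

# Lines copied from the dmesg output above (one per failed BAR, plus one success).
SAMPLE = """\
[ 375.608912] pci 0000:0a:00.0: BAR 1 [mem size 0x10000000 64bit pref]: can't assign; no space
[ 375.608917] pci 0000:0a:00.0: BAR 3 [mem size 0x02000000 64bit pref]: can't assign; no space
[ 375.608922] pci 0000:0a:00.0: BAR 0 [mem 0xc5000000-0xc5ffffff]: assigned
[ 375.609031] pci 0000:0a:00.0: BAR 5 [io size 0x0080]: can't assign; no space
"""

# Matches only the "can't assign" lines, so the follow-up "failed to assign"
# lines for the same BAR are not counted twice.
FAILED = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+) "
    r"\[(?P<kind>mem|io) size (?P<size>0x[0-9a-f]+)[^\]]*\]: can['’]t assign"
)


def failed_bars(log_text):
    """Return (device, bar, kind, size_in_bytes) for each BAR that got no space."""
    return [
        (m["dev"], int(m["bar"]), m["kind"], int(m["size"], 16))
        for m in FAILED.finditer(log_text)
    ]


def summarize(log_text):
    """Total the unassigned bytes per resource kind ("mem" or "io")."""
    totals = {}
    for _dev, _bar, kind, size in failed_bars(log_text):
        totals[kind] = totals.get(kind, 0) + size
    return totals


if __name__ == "__main__":
    for dev, bar, kind, size in failed_bars(SAMPLE):
        print(f"{dev} BAR {bar}: {kind} {size} bytes unassigned")
    for kind, total in summarize(SAMPLE).items():
        print(f"total {kind}: {total} bytes")
```

For the log above, BAR 1 needs 0x10000000 bytes (256 MiB) and BAR 3 needs 0x02000000 bytes (32 MiB), so 288 MiB of prefetchable memory in total. That is more than the 256M given in hpmmioprefsize, which might be part of the problem, though this is only a guess.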
The laptop’s BIOS only has 32-bit resources enabled, which is not sufficient for a third GPU. You might check whether you can disable the internal MX150 completely to free up address space.
OK, after playing around a bit and attaching my eGPU directly to the laptop rather than through the docking station, I am able to see the card. There are now no errors in any of the logs.
The card, however, is still absent from nvidia-smi.
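One thing that may help narrow this down: nvidia-smi can only list GPUs that the nvidia kernel driver has attached to, so it is worth checking whether the hot-plugged function has a driver bound at all. Below is a minimal sketch that reads the driver symlink in sysfs. The helper name bound_driver is made up for illustration, and the address 0000:0a:00.0 is taken from the lspci output earlier in the thread, so it may be different after re-plugging.

```python
import os


def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI function, or None."""
    link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(link):
        # Either the device is not present or no driver has claimed it.
        return None
    # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    # Addresses assumed from the lspci output in this thread:
    # 01:00.0 is the internal MX150, 0a:00.0 is the RTX 2070 in the eGPU.
    for bdf in ("0000:01:00.0", "0000:0a:00.0"):
        print(bdf, "->", bound_driver(bdf) or "no driver bound")
```

If the eGPU reports no driver, or a driver other than nvidia, while the MX150 shows nvidia, the problem is likely at the driver binding step rather than PCI enumeration. The same information is shown by lspci -k in the "Kernel driver in use" line.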