Driver for RTX3070 not working under Elementary OS on MacBook Pro with eGPU

Dear Nvidia-team,

I have a 2015 MacBook Pro running elementary OS from a Thunderbolt JetDrive, with a Razer Core X containing an RTX 3070 attached to another Thunderbolt port. I want to drive a 5K monitor via a DP->Thunderbolt cable. I have tried installing the latest NVIDIA driver (455) with the NVIDIA installer script, with apt, and with Ubuntu's ‘Additional Drivers’ tool, but the driver still doesn’t work.

If the eGPU is already connected to the Thunderbolt port at boot, the internal screen stays black after the reboot. Without the eGPU connected, the machine boots normally. When I then connect the eGPU, I get:

dmesg | grep -i nvidia
[ 36.687431] nvidiafb 0000:3c:00.0: enabling device (0000 -> 0003)
[ 36.687524] nvidiafb: Device ID: 10de2484
[ 36.687525] nvidiafb: unknown NV_ARCH
[ 36.978490] nvidia: loading out-of-tree module taints kernel.
[ 36.978527] nvidia: module license ‘NVIDIA’ taints kernel.
[ 36.986017] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 36.992981] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[ 36.993567] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[ 36.993710] nvidia: probe of 0000:3c:00.0 failed with error -1
[ 36.993724] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 36.993725] NVRM: None of the NVIDIA devices were initialized.
[ 36.994012] nvidia-nvlink: Unregistered the Nvlink Core, major device number 236
[ 37.362603] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[ 37.363441] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[ 37.363581] nvidia: probe of 0000:3c:00.0 failed with error -1

and
lspci -vv | grep -i -A11 nvidia
3c:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device 404c
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 19
Region 0: Memory at a5000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at (64-bit, prefetchable)
Region 3: Memory at (64-bit, prefetchable)
Region 5: I/O ports at 5000 [size=128]
[virtual] Expansion ROM at a6000000 [disabled] [size=512K]
Capabilities:
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

3c:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device 404c
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 16
Region 0: Memory at a6080000 (32-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

I have followed several guides to get rid of nouveau, but without success.
Hopefully the bug report helps you understand what the problem is.

All the best,

Florian

nvidia-bug-report.log.gz (607.6 KB)

nvidia-installer.log (28.7 KB)

You’ll have to blacklist nvidiafb, it’s blocking the nvidia driver from working.
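
For reference, blacklisting a module on an Ubuntu-based system usually means adding a `blacklist` line to a file under /etc/modprobe.d and rebuilding the initramfs. The sketch below only generates the file content; the function name `blacklist_conf` and the target file name in the usage note are illustrative and not part of any tool.

```shell
# Print a modprobe.d fragment that blacklists one kernel module.
# Note: "blacklist" only stops automatic loading through module aliases;
# an explicit "modprobe <module>" still works.
blacklist_conf() {
    module=$1
    printf '# Keep %s from being auto-loaded\n' "$module"
    printf 'blacklist %s\n' "$module"
}
```

A possible use, assuming sudo and an initramfs-tools based system: `blacklist_conf nvidiafb | sudo tee /etc/modprobe.d/blacklist-nvidiafb.conf`, followed by `sudo update-initramfs -u` and a reboot.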

Thank you for your quick response. I blacklisted nvidiafb, purged (or tried to purge) everything NVIDIA-related, and reran NVIDIA-Linux-x86_64-455.45.01.run. Same problem.
I have again attached the installer log and the bug report.

‘lspci -vv | grep -i -A11 nvidia’ gives
3c:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device 404c
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 19
Region 0: Memory at a5000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at (64-bit, prefetchable)
Region 3: Memory at (64-bit, prefetchable)
Region 5: I/O ports at 5000 [size=128]
[virtual] Expansion ROM at a6000000 [disabled] [size=512K]
Capabilities:
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

3c:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device 404c
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 16
Region 0: Memory at a6080000 (32-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

Whatever I try to blacklist, I can’t get rid of nvidiafb and nouveau.
In /etc/modprobe.d I have:
blacklist-framebuffer.conf:
# Framebuffer drivers are generally buggy and poorly-supported, and cause
# suspend failures, kernel panics and general mayhem. For this reason we
# never load them automatically.
blacklist aty128fb
blacklist atyfb
blacklist radeonfb
blacklist cirrusfb
blacklist cyber2000fb
blacklist cyblafb
blacklist gx1fb
blacklist hgafb
blacklist i810fb
blacklist intelfb
blacklist kyrofb
blacklist lxfb
blacklist matroxfb_base
blacklist neofb
blacklist nvidiafb
blacklist pm2fb
blacklist rivafb
blacklist s1d13xxxfb
blacklist savagefb
blacklist sisfb
blacklist sstfb
blacklist tdfxfb
blacklist tridentfb
blacklist vesafb
blacklist vfb
blacklist viafb
blacklist vt8623fb
blacklist udlfb

blacklist-nouveau.conf:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

blacklist-nvidia-nouveau.conf:
blacklist nouveau
options nouveau modeset=0

nouveau-kms.conf:
options nouveau modeset=0

In /lib/modprobe.d/ I have
fbdev-blacklist.conf:
# This file blacklists most old-style PCI framebuffer drivers.

blacklist arkfb
blacklist aty128fb
blacklist atyfb
blacklist radeonfb
blacklist cirrusfb
blacklist cyber2000fb
blacklist kyrofb
blacklist matroxfb_base
blacklist mb862xxfb
blacklist neofb
blacklist pm2fb
blacklist pm3fb
blacklist s3fb
blacklist savagefb
blacklist sisfb
blacklist tdfxfb
blacklist tridentfb
blacklist vt8623fb
blacklist nvidiafb

and nvidia-graphics-drivers.conf:
blacklist nouveau
blacklist lbm-nouveau

and have run ‘sudo update-initramfs -u’ several times.
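
A quick way to see whether the blacklists take effect is to look at what is actually loaded. The `Kernel modules:` line in the lspci output lists modules that could drive the device, not modules that are loaded; only a `Kernel driver in use:` line shows a bound driver. The sketch below checks a /proc/modules-style listing for given names; `loaded_among` is a hypothetical helper name.

```shell
# Print which of the given module names appear in a /proc/modules-style listing.
# $1 is the listing file (normally /proc/modules); the rest are module names.
loaded_among() {
    listing=$1
    shift
    for mod in "$@"; do
        # The first whitespace-separated field of each line is the module name.
        if awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing"; then
            echo "$mod"
        fi
    done
}
```

For example, `loaded_among /proc/modules nvidiafb nouveau nvidia` prints only the names that are currently loaded. Module names in /proc/modules use underscores, so nvidia_drm would be passed as `nvidia_drm`.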

What could I do?
All the best,

Florian

nvidia-bug-report.log.gz (555.5 KB)
nvidia-installer.log (28.9 KB)

Now you’re running into

NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:3c:00.0)

which is very common on Apple hardware. Please try using the kernel parameter
pci=realloc
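
A minimal sketch for checking, after a reboot, whether the parameter actually reached the kernel; `has_kernel_param` is a hypothetical helper. On Ubuntu-based systems that boot through GRUB (an assumption here), the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by `sudo update-grub`.

```shell
# Succeed if the given parameter appears as a whole word on the kernel command line.
# $1 is the parameter, $2 an optional file to read instead of /proc/cmdline.
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each space-separated parameter on its own line, then match exactly.
    tr -s ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param pci=realloc && echo active` prints `active` only when the running kernel was booted with that parameter.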
If that doesn’t help, you’ll need to remove and add the GPU again before loading the driver; see this for some hints:
https://github.com/Dunedan/mbp-2016-linux/issues/60
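
One way to read "remove and add the GPU again" is a PCI remove and rescan through sysfs; that reading is an assumption, and the linked issue may describe additional or different steps. The sketch below uses the kernel's standard `remove` and `rescan` sysfs files. `pci_reprobe` is a hypothetical helper, and the optional second argument exists only so the function can be tried against a fake directory tree.

```shell
# Remove one PCI device and rescan the bus so the kernel re-enumerates it.
# $1 is the PCI address (e.g. 0000:3c:00.0), $2 an optional sysfs root (default /sys).
# Must run as root on a real system, and before the nvidia module is loaded.
pci_reprobe() {
    addr=$1
    sysfs=${2:-/sys}
    if [ -z "$addr" ]; then
        echo "usage: pci_reprobe <pci-address> [sysfs-root]" >&2
        return 1
    fi
    dev="$sysfs/bus/pci/devices/$addr"
    # Detach the device if it is currently present.
    if [ -e "$dev/remove" ]; then
        echo 1 > "$dev/remove"
    fi
    # Ask the kernel to rescan all PCI buses.
    echo 1 > "$sysfs/bus/pci/rescan"
}
```

With the address from the lspci output above this would be `pci_reprobe 0000:3c:00.0`, run from a root shell. Whether a rescan alone gets the BARs assigned on this machine is not guaranteed.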