Driver not loading. Misconfigured GPU. Ubuntu 18.04

Hi,

I’m trying to install CUDA from the .deb package on a Mac Pro (Late 2013) running Ubuntu 18.04 to do some ML. I have an RTX 2080 Ti in a Razer Core X, which is recognized by the system, but the driver does not seem to load.

After following the installation guide for Linux, the command

nvidia-smi

returns

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

and

~$ nvidia-settings

returns

ERROR: NVIDIA driver is not loaded

ERROR: Unable to load info from any available system

Even though lspci does list the card:

~$ lspci | grep -i nvidia
19:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1)

Looking at the bug report I also found the following.

dmesg
[  378.103711] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[  378.103714] NVRM: The system BIOS may have misconfigured your GPU.
[  378.103726] nvidia: probe of 0000:19:00.0 failed with error -1
[  378.103785] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  378.103787] NVRM: None of the NVIDIA devices were initialized.
[  378.104219] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[  378.806726] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[  378.807642] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[  378.807645] NVRM: The system BIOS may have misconfigured your GPU.
[  378.807653] nvidia: probe of 0000:19:00.0 failed with error -1
[  378.807699] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  378.807700] NVRM: None of the NVIDIA devices were initialized.
[  378.807984] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[  379.506203] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[  379.507025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[  379.507026] NVRM: The system BIOS may have misconfigured your GPU.
[  379.507033] nvidia: probe of 0000:19:00.0 failed with error -1
[  379.507076] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  379.507077] NVRM: None of the NVIDIA devices were initialized.
...
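
Since the same NVRM block repeats on every probe attempt, a small script can boil the log down to the BARs being complained about. This is only a rough sketch: the function name find_invalid_bars is made up for illustration, the regex is based solely on the line format shown above, and treating "size 0 or address 0" as invalid is an assumption about what the driver is reporting.

```python
import re

# Matches lines such as: "NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)"
# (pattern derived only from the dmesg excerpt in this post)
BAR_LINE = re.compile(
    r"NVRM: BAR(?P<bar>\d+) is (?P<size_mb>\d+)M @ (?P<addr>0x[0-9a-fA-F]+) "
    r"\(PCI:(?P<pci>[0-9a-fA-F:.]+)\)"
)


def find_invalid_bars(dmesg_text):
    """Return sorted, de-duplicated (pci_address, bar_index) pairs.

    Assumption: a BAR reported with size 0 or base address 0 was never
    assigned memory by the firmware or the kernel.
    """
    invalid = set()
    for match in BAR_LINE.finditer(dmesg_text):
        size_mb = int(match.group("size_mb"))
        addr = int(match.group("addr"), 16)
        if size_mb == 0 or addr == 0:
            invalid.add((match.group("pci"), int(match.group("bar"))))
    return sorted(invalid)


# Two of the repeated blocks from the log above, shortened.
SAMPLE = """\
[  378.103711] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[  378.103714] NVRM: The system BIOS may have misconfigured your GPU.
[  378.807642] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
"""

for pci, bar in find_invalid_bars(SAMPLE):
    print(f"{pci}: BAR{bar} has no memory assigned")
```

In practice the text would come from the real dmesg output (for example, saved to a file) rather than from the embedded sample.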

and

/usr/bin/lspci -d "10de:*" -v -xxx

19:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: eVga.com. Corp. Device 2484
	Flags: fast devsel, IRQ 27
	Memory at a1000000 (32-bit, non-prefetchable) 
	Memory at <ignored> (64-bit, prefetchable)
	Memory at c0000000 (64-bit, prefetchable) 
	I/O ports at 5000 
	Expansion ROM at a2000000 [disabled] 
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Capabilities: [bb0] #15
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
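
In the listing above, the second memory region shows <ignored> instead of an address, which looks consistent with the BAR1 complaint in dmesg. A hedged sketch for picking such regions out of lspci -v output follows; list_memory_regions is a hypothetical helper, and it simply assumes that anything other than a hexadecimal address after "Memory at" means the region has no usable address.

```python
import re

# Matches lines such as: "Memory at a1000000 (32-bit, non-prefetchable)"
# (pattern derived only from the lspci -v excerpt in this post)
REGION_LINE = re.compile(r"^\s*Memory at (?P<addr>\S+) \((?P<attrs>[^)]*)\)")
HEX_ADDR = re.compile(r"[0-9a-fA-F]+")


def list_memory_regions(lspci_text):
    """Return (address, attributes, has_address) tuples, in listing order.

    Assumption: a region is usable only if lspci printed a hexadecimal
    address for it; placeholders such as "<ignored>" count as not usable.
    """
    regions = []
    for line in lspci_text.splitlines():
        match = REGION_LINE.match(line)
        if match:
            addr = match.group("addr")
            has_address = HEX_ADDR.fullmatch(addr) is not None
            regions.append((addr, match.group("attrs"), has_address))
    return regions


# Memory lines copied from the lspci output above.
SAMPLE = """\
19:00.0 VGA compatible controller: NVIDIA Corporation TU102 (rev a1)
    Memory at a1000000 (32-bit, non-prefetchable)
    Memory at <ignored> (64-bit, prefetchable)
    Memory at c0000000 (64-bit, prefetchable)
    I/O ports at 5000
"""

for index, (addr, attrs, has_address) in enumerate(list_memory_regions(SAMPLE)):
    status = "ok" if has_address else "NO ADDRESS"
    print(f"memory region #{index}: {addr} ({attrs}) -> {status}")
```

The index printed here is just the position in the listing, not necessarily the hardware BAR number, since 64-bit regions occupy two BAR slots.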

Obviously, something’s going wrong with memory… but I don’t know what or how to fix it.

The Mac Pro has 16 GB of RAM, which is probably worth mentioning.

Any help would be very much appreciated!
Thank you.
nvidia-bug-report.log.gz (1.06 MB)

Hi antoinedaurat, are you installing version 10.2 of CUDA? I encountered a similar problem on RHEL 7.7. I removed the NVIDIA driver with yum remove nvidia-driver and then reissued the CUDA installation command, i.e. yum install cuda, which installed and properly configured the driver for the GPU.