Hi,
I’m trying to install CUDA from the .deb package on a Mac Pro (Late 2013) running Ubuntu 18.04 to do some ML. I have an RTX 2080 Ti in a Razer Core X enclosure. The system recognizes the card, but the driver does not seem to load.
After following the installation guide for Linux, the command
nvidia-smi
returns
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
and
~$ nvidia-settings
returns
ERROR: NVIDIA driver is not loaded
ERROR: Unable to load info from any available system
even though the card is listed by lspci:
~$ lspci | grep -i nvidia
19:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1)
Looking at the bug report I also found the following.
dmesg
[ 378.103711] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[ 378.103714] NVRM: The system BIOS may have misconfigured your GPU.
[ 378.103726] nvidia: probe of 0000:19:00.0 failed with error -1
[ 378.103785] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 378.103787] NVRM: None of the NVIDIA devices were initialized.
[ 378.104219] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[ 378.806726] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[ 378.807642] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[ 378.807645] NVRM: The system BIOS may have misconfigured your GPU.
[ 378.807653] nvidia: probe of 0000:19:00.0 failed with error -1
[ 378.807699] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 378.807700] NVRM: None of the NVIDIA devices were initialized.
[ 378.807984] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[ 379.506203] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[ 379.507025] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[ 379.507026] NVRM: The system BIOS may have misconfigured your GPU.
[ 379.507033] nvidia: probe of 0000:19:00.0 failed with error -1
[ 379.507076] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 379.507077] NVRM: None of the NVIDIA devices were initialized.
...
and
/usr/bin/lspci -d "10de:*" -v -xxx
19:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1) (prog-if 00 [VGA controller])
Subsystem: eVga.com. Corp. Device 2484
Flags: fast devsel, IRQ 27
Memory at a1000000 (32-bit, non-prefetchable)
Memory at <ignored> (64-bit, prefetchable)
Memory at c0000000 (64-bit, prefetchable)
I/O ports at 5000
Expansion ROM at a2000000 [disabled]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Capabilities: [bb0] #15
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
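Something is clearly going wrong with the memory mapping: lspci shows one region as "Memory at <ignored>", which seems to match the "BAR1 is 0M @ 0x0" error in dmesg, but I don’t know what causes it or how to fix it.
It’s probably worth mentioning that the Mac Pro has 16 GB of RAM.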
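For anyone who wants to cross-check the same two symptoms in their own logs, here is a minimal Python sketch. It only parses text shaped like the dmesg and lspci excerpts above. The helper names (invalid_bars, unassigned_regions) are made up for illustration, and the idea that the second listed memory region corresponds to BAR1 is my assumption, not something the driver confirms.

```python
import re

# Excerpts copied from the dmesg and lspci output in this post.
DMESG = """\
[ 378.103711] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:19:00.0)
[ 378.103714] NVRM: The system BIOS may have misconfigured your GPU.
[ 378.103726] nvidia: probe of 0000:19:00.0 failed with error -1
"""

LSPCI = """\
Memory at a1000000 (32-bit, non-prefetchable)
Memory at <ignored> (64-bit, prefetchable)
Memory at c0000000 (64-bit, prefetchable)
I/O ports at 5000
"""

# These patterns target only the line shapes shown above (an assumption,
# not a general parser for every driver or lspci version).
BAR_ERROR = re.compile(
    r"NVRM: BAR(\d+) is (\d+)M @ (0x[0-9a-fA-F]+) \(PCI:([0-9a-fA-F:.]+)\)"
)
MEMORY_REGION = re.compile(r"Memory at (\S+) \(")


def invalid_bars(dmesg_text):
    """Return unique (bar_index, size_mb, address, pci_id) tuples, in log order."""
    found = [
        (int(bar), int(size), int(addr, 16), dev)
        for bar, size, addr, dev in BAR_ERROR.findall(dmesg_text)
    ]
    # The driver retries the probe, so the same complaint repeats; drop repeats.
    return list(dict.fromkeys(found))


def unassigned_regions(lspci_text):
    """Return 0-based listing positions of memory regions marked <ignored>."""
    addresses = MEMORY_REGION.findall(lspci_text)
    return [i for i, addr in enumerate(addresses) if addr == "<ignored>"]


print("BARs rejected by the driver:", invalid_bars(DMESG))
print("Regions without an address: ", unassigned_regions(LSPCI))
```

On the excerpts above, both checks point at the same place: the driver rejects BAR1 with size 0 at address 0, and the region at listing position 1 is the one without an address.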
Any help would be very much appreciated!
Thank you.
nvidia-bug-report.log.gz (1.06 MB)