Installation problem GeForce1080Ti

Hello,

I am trying to set up a secondary monitor on a workstation with an Asus motherboard + Sonnet Breakaway Box + GeForce 1080 Ti.

The driver is installed but I do not get any video signal on the secondary monitor.

To install the driver I followed these steps:

sudo apt-get remove --purge '^nvidia-.*'
sudo ubuntu-drivers autoinstall

The following is the output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 510.60.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:3C:00.0 Off |                  N/A |
| 28%   24C    P8     8W / 250W |     92MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
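
A side note on the header above: nvidia-smi reports itself as 470.103.01 while the loaded driver is 510.60.02. These two numbers normally match, so a mismatch may point to leftovers from two driver installs. A minimal Python sketch for flagging this from captured nvidia-smi text (the function names are made up for illustration and are not part of any NVIDIA tool):

```python
import re

# Matches the first header row of nvidia-smi's default table output.
HEADER_RE = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+Driver Version:\s+(?P<driver>[\d.]+)"
)


def smi_driver_versions(text):
    """Return (smi_version, driver_version) from nvidia-smi output, or None."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("smi"), match.group("driver")


def versions_match(text):
    """True only if a header was found and both version strings are equal."""
    versions = smi_driver_versions(text)
    return versions is not None and versions[0] == versions[1]
```

Feeding it the header line from this post yields two different versions, whereas the later 460.32.03 output in this thread yields two identical ones.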

The following is the output of nvidia-settings:

ERROR: Unable to load info from any available system

(nvidia-settings:29183): GLib-GObject-CRITICAL **: 08:50:47.946: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
** Message: 08:50:47.949: PRIME: No offloading required. Abort
** Message: 08:50:47.949: PRIME: is it supported? no

I tried
sudo prime-select nvidia

but it did not change the situation.

The bug report is attached:

nvidia-bug-report.log.gz (269.8 KB)

Nvidia X Server does not launch correctly (it comes up empty). I also installed

sudo apt install nvidia-cuda-toolkit

with no effect.

dkms status does not return anything.

Any idea how to solve this? I have also tried installing from the NVIDIA .run files and disabling Secure Boot.

thank you!
vittorio

Module “nvidia” not found

The kernel driver is running but the Xorg components seem to be either missing or a path is unset.
Please delete /etc/X11/xorg.conf, reboot and create a new nvidia-bug-report.log

In the meantime I had to change the driver to 460 to work with CUDA 11.2 for TensorFlow, downgrading the kernel to 5.4. TensorFlow works fine with the GPU, but the display problem is still there.
The new output from nvidia-smi is the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:3C:00.0 Off |                  N/A |
| 28%   24C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

New bug report:

nvidia-bug-report.log.gz (336.5 KB)

The file that you mention is not present in my system, but there is a directory

/usr/share/X11/xorg.conf.d

Shall I delete it?

thank you!

Please remove nomodeset kernel parameter.

If I remove nomodeset, neither display works.

I believe the MB is not supported by the 5.4 kernel. I had to install the drivers for the Ethernet adapter separately. I also tried with and without nouveau.modeset=0; in both cases neither display works.
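
To confirm which of these parameters the running kernel actually booted with, the command line can be read back from /proc/cmdline. A minimal Python sketch; the helper names are hypothetical, and nvidia-drm.modeset and i915.modeset are included as extra related parameters that were not mentioned in this thread:

```python
# Parameters that influence kernel mode setting. Only nomodeset and
# nouveau.modeset come from this thread; the other two are assumptions
# added for completeness.
MODESET_KEYS = ("nomodeset", "nouveau.modeset", "nvidia-drm.modeset", "i915.modeset")


def modeset_flags(cmdline):
    """Return {parameter: value} for mode-setting parameters in a kernel
    command line. Bare flags such as nomodeset map to None."""
    found = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in MODESET_KEYS:
            found[key] = value or None
    return found


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel."""
    with open(path) as handle:
        return handle.read()
```

On the machine itself, `modeset_flags(read_cmdline())` shows whether an edit to the GRUB entry really took effect after the reboot.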

I want to use only the display connected to the Nvidia GPU; I am just using the one connected to the MB for the setup.

You have an Intel 10th gen CPU, which is supported by the 5.4 GA kernel. Instead of manually installing an Ethernet driver, why didn’t you just use the 5.13 HWE kernel?

I thought I needed 5.4 to install CUDA 11.2. The .deb package for CUDA was forcing the installation of the 460 driver, which is not compatible with 5.13.
But today I started from scratch again: I formatted the disk and installed a fresh copy of Ubuntu with the 5.13 kernel, the nvidia 520 driver and then CUDA 11.2 using the .run file, which lets me deselect the installation of 460. TensorFlow is loading the GPU properly in this configuration too.

5.4 supports the CPU, but was not installing the audio codec and other drivers like the Ethernet one and the one for the Intel HD Graphics 630. That is why I needed to boot with ‘nomodeset’, I think.
With 5.13, the driver for the internal graphics card (i915) is installed. So no more ‘nomodeset’ at boot.

But also in this configuration the display connected to the Nvidia GPU is still not working.
I can use the system now, with the display connected to the Intel GPU and the Nvidia GPU used only for TensorFlow. But it bugs me a little that it is not working. In the end it is a GPU, and driving a display should be its main function.

Here is the new bug report. I will not change the configuration for the time being.

nvidia-bug-report.log.gz (305.1 KB)

P.S. During the various attempts on the 5.4 kernel, after installing the HD 630 driver, I had the splash screen coming up on the main display and the login screen on the Nvidia GPU display only. The login was looping, but I could log in by changing to Wayland. I do not know if this helps you in any way.

Please see this:
https://forums.developer.nvidia.com/t/linux-mint-nvidia-driver-loads-with-startx-but-not-on-initial-startup/168262/2

I get an error message about no GPU being found for the 3 commands before the GUI is launched.

I have also tried to activate the persistence daemon with the command

$ sudo nvidia-persistenced --user xxxx

but it fails to initialize, and the syslog shows the following:

May 8 03:05:05 UB002 nvidia-persistenced: Received signal 15
May 8 03:05:05 UB002 update-notifier[1778]: update-notifier: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
May 8 03:05:05 UB002 nvidia-persistenced: Socket closed.
May 8 03:05:05 UB002 systemd[1]: Stopping Tool to automatically collect and submit kernel crash signatures...
May 8 03:05:05 UB002 nvidia-persistenced: PID file unlocked.
May 8 03:05:05 UB002 systemd[1]: Stopping Dispatcher daemon for systemd-networkd...
May 8 03:05:05 UB002 nvidia-persistenced: PID file closed.

With

sudo nvidia-smi -pm 1

persistence mode does turn on, but of course that does not help after a reboot.
I have just noticed that the Nvidia display shows up in the Settings > Displays menu, but if I try to set ‘Join Display’ or ‘Single Display’ it does not work.

The GPU is connected through a Thunderbolt PCI card (ThunderboltEX3-TR from Asus). Maybe this is the problem: does the Nvidia driver somehow try to load before the Thunderbolt connection is available?
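
One way to start looking into this hypothesis is to check whether the enclosure shows up as an authorized Thunderbolt device once the system is running. A minimal Python sketch, assuming the kernel exposes Thunderbolt devices under /sys/bus/thunderbolt/devices with `authorized` and `device_name` attributes (as described in the kernel's Thunderbolt admin guide); the function name is made up for illustration:

```python
from pathlib import Path


def thunderbolt_authorization(sysfs_root="/sys/bus/thunderbolt/devices"):
    """Return {device name: is_authorized} for Thunderbolt devices.

    Entries without an `authorized` attribute (for example the domain
    entries) are skipped. A value of "0" means not authorized.
    """
    states = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return states  # no Thunderbolt bus visible on this system
    for device in sorted(root.iterdir()):
        flag = device / "authorized"
        if not flag.is_file():
            continue
        name_file = device / "device_name"
        # Fall back to the sysfs entry name if no readable name exists.
        name = name_file.read_text().strip() if name_file.is_file() else device.name
        states[name] = flag.read_text().strip() != "0"
    return states
```

This only reports the state at the moment it runs. It says nothing about load ordering during boot, so an authorized enclosure here does not rule the hypothesis out.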

Drivers will only load when the corresponding device is detected.
Please check for a blacklist file.
eGPUs have to be enabled for Xorg usage:
https://forums.developer.nvidia.com/t/black-screen-when-nvidia-connected-software-and-updates-dont-see-card-anymore-gdc-used-gt-1030-fan-running/213510/4?u=generix

I have checked for blacklist files in modprobe.d. There are these:
alsa-base.conf blacklist-modem.conf
amd64-microcode-blacklist.conf blacklist-oss.conf
blacklist-ath_pci.conf blacklist-rare-network.conf
blacklist.conf dkms.conf
blacklist-firewire.conf intel-microcode-blacklist.conf
blacklist-framebuffer.conf iwlwifi.conf

In the blacklist.conf file I could not find anything related to nvidia.
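
Since blacklist rules can live in any .conf file and in more than one modprobe.d directory, a scan across all of them can be more reliable than reading blacklist.conf alone. A minimal Python sketch; the function name is hypothetical and the directory list in the usage note is an assumption based on the locations documented in modprobe.d(5):

```python
from pathlib import Path


def find_module_rules(module, directories):
    """Return (path, line_number, line) for blacklist/alias/install rules
    whose arguments start with `module`. Only *.conf files are read,
    mirroring what modprobe itself considers."""
    hits = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for conf in sorted(root.glob("*.conf")):
            lines = conf.read_text(errors="replace").splitlines()
            for number, line in enumerate(lines, 1):
                stripped = line.strip()
                if not stripped or stripped.startswith("#"):
                    continue  # skip blanks and comments
                words = stripped.split()
                if words[0] in ("blacklist", "alias", "install") and any(
                    word.startswith(module) for word in words[1:]
                ):
                    hits.append((str(conf), number, stripped))
    return hits
```

For example, `find_module_rules("nvidia", ["/etc/modprobe.d", "/lib/modprobe.d", "/run/modprobe.d"])` lists every rule that touches the nvidia modules, whichever file it is in.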

The directory /etc/X11/xorg.conf.d was not there. I have created it and the file too. I tried both options, but there was no change in the displays' behavior.

Below is the content of the gpu-manager log:

log_file: /var/log/gpu-manager.log
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-nvidia-was-loaded file
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/5.13.0-40-generic/kernel
Looking for nvidia modules in /lib/modules/5.13.0-40-generic/updates/dkms
Found nvidia.ko module in /lib/modules/5.13.0-40-generic/updates/dkms/nvidia.ko
Looking for amdgpu modules in /lib/modules/5.13.0-40-generic/kernel
Looking for amdgpu modules in /lib/modules/5.13.0-40-generic/updates/dkms
Is nvidia loaded? no
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is nvidia kernel module available? yes
Is amdgpu kernel module available? no
Vendor/Device Id: 8086:9bc8
BusID "PCI:0@0:2:0"
Is boot vga? yes
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card0", driven by "i915"
Found "/dev/dri/card0", driven by "i915"
output 0:
card0-HDMI-A-1
Number of connected outputs for /dev/dri/card0: 1
Does it require offloading? yes
last cards number = 1
Has amd? no
Has intel? yes
Has nvidia? no
How many cards? 1

nvidia-bug-report.log.gz (311.8 KB)

blacklist.conf (1.5 KB)

I have renamed the file
/lib/modprobe.d/nvidia-graphics-drivers.conf
to
/lib/modprobe.d/nvidia-graphics-drivers.old

On reboot I can see both displays, but nvidia-smi does not detect the driver.

If I reboot in recovery mode, the Nvidia display comes on and the driver is detected by nvidia-smi.

nvidia-bug-report.log.gz (324.3 KB)

Here is the gpu-manager.log:

log_file: /var/log/gpu-manager.log
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-nvidia-was-loaded file
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/5.13.0-40-generic/kernel
Looking for nvidia modules in /lib/modules/5.13.0-40-generic/updates/dkms
Found nvidia.ko module in /lib/modules/5.13.0-40-generic/updates/dkms/nvidia.ko
Looking for amdgpu modules in /lib/modules/5.13.0-40-generic/kernel
Looking for amdgpu modules in /lib/modules/5.13.0-40-generic/updates/dkms
Is nvidia loaded? no
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? no
Is nvidia kernel module available? yes
Is amdgpu kernel module available? no
Vendor/Device Id: 8086:9bc8
BusID "PCI:0@0:2:0"
Is boot vga? yes
Error: can't access /sys/bus/pci/devices/0000:00:02.0/driver
The device is not bound to any driver.
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Does it require offloading? no
last cards number = 1
Has amd? no
Has intel? yes
Has nvidia? no
How many cards? 1
Has the system changed? No
Single card detected
Nothing to do
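
The two gpu-manager logs in this thread are long and mostly identical, so it can help to extract the yes/no answers and compare them. A minimal Python sketch, assuming the "Question? answer" line format seen in the logs above; both function names are made up for illustration:

```python
def parse_gpu_manager_log(text):
    """Collect the yes/no lines of a gpu-manager log as {question: answer}.

    Lines whose answer is not yes/no (such as "How many cards? 1") are
    ignored, and answers are lower-cased so "No" and "no" compare equal.
    """
    answers = {}
    for line in text.splitlines():
        question, separator, answer = line.strip().rpartition("? ")
        if separator and answer.lower() in ("yes", "no"):
            answers[question + "?"] = answer.lower()
    return answers


def changed_answers(before, after):
    """Return {question: (old, new)} for questions present in both logs
    whose answers differ."""
    old, new = parse_gpu_manager_log(before), parse_gpu_manager_log(after)
    return {q: (old[q], new[q]) for q in old if q in new and old[q] != new[q]}
```

Run against the two logs pasted in this thread, this would surface the lines whose answers changed between boots, such as "Is nouveau blacklisted?" and "Does it require offloading?".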