Can't install any nvidia driver for Quadro K3100M on Ubuntu 22

System:
Host: xxx Kernel: 6.5.0-15-generic x86_64 bits: 64 Desktop: GNOME 42.9
Distro: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Graphics:
Device-1: Intel 4th Gen Core Processor Integrated Graphics driver: i915
v: kernel
Device-2: NVIDIA GK104GLM [Quadro K3100M] driver: N/A
Device-3: Chicony HP HD Webcam type: USB driver: uvcvideo
Display: wayland server: X.Org v: 1.22.1.1 with: Xwayland v: 22.1.1
compositor: gnome-shell driver: X: loaded: modesetting,nvidia
unloaded: fbdev,nouveau,vesa gpu: i915 resolution: 1920x1080~60Hz
OpenGL: renderer: Mesa Intel HD Graphics 4600 (HSW GT2)
v: 4.6 Mesa 23.0.4-0ubuntu1~22.04.1

Problem:

The system installs cleanly with the X.Org nouveau driver.

The other two driver options offered are:
nvidia-driver-390
nvidia-driver-418-server

Both installations fail with an error code.

What should I do?

Please install the 470 driver using Software & Updates; this is the correct driver for your Kepler-based GPU.

Thank you!

These are available options:

I am not sure how to get driver 470 from Software & Updates.

I tried it like this:

$ sudo apt install nvidia-driver-470

then:

$ nvidia-smi
Mon Feb  5 18:05:42 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro K3100M       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P8     3W /  N/A |      5MiB /  4037MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1646      G   /usr/lib/xorg/Xorg                  2MiB |
+-----------------------------------------------------------------------------+

then:

$ nvtop

The nvtop window never shows any readings.

Does this look right?

How to test the GPU?

The NVIDIA GPU is in offload mode; see
https://http.download.nvidia.com/XFree86/Linux-x86_64/550.40.07/README/primerenderoffload.html
Try running:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxgears

Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
302 frames in 5.0 seconds = 60.283 FPS
301 frames in 5.0 seconds = 60.007 FPS
301 frames in 5.0 seconds = 60.003 FPS

Watch nvidia-smi or nvtop while it is running; glxgears should show up there.
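The same offload test can be scripted. Below is a minimal Python sketch, assuming only that the two environment variables from the command above are what selects the NVIDIA GPU. The helper names offload_env and run_offloaded are made up for illustration and are not part of any NVIDIA tool.

```python
import os
import subprocess


def offload_env(base=None):
    """Return a copy of an environment with the PRIME render offload variables set."""
    # Copy so that neither os.environ nor the caller's dict is modified.
    env = dict(os.environ if base is None else base)
    env["__NV_PRIME_RENDER_OFFLOAD"] = "1"
    env["__GLX_VENDOR_LIBRARY_NAME"] = "nvidia"
    return env


def run_offloaded(command):
    """Run a command (a list of arguments) with the offload variables applied."""
    # check=False: a non-zero exit status is returned to the caller, not raised.
    return subprocess.run(command, env=offload_env(), check=False)
```

Calling run_offloaded(["glxgears"]) should behave like the one-line command above and blocks until the glxgears window is closed, so nvidia-smi or nvtop has to be watched from a second terminal.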


This is the current status:

  • cogwheels are turning
  • first terminal window is showing ~60 FPS
  • nvidia-smi looks the same to me
  • nvtop shows some activity

After:

sudo apt install nvidia-cuda-toolkit

nvidia-smi returns:

Command 'nvidia-smi' not found, but can be installed with:
sudo apt install nvidia-utils-390         # version 390.157-0ubuntu0.22.04.2, or
sudo apt install nvidia-utils-418-server  # version 418.226.00-0ubuntu5~0.22.04.1
sudo apt install nvidia-utils-450-server  # version 450.248.02-0ubuntu0.22.04.1
sudo apt install nvidia-utils-470         # version 470.223.02-0ubuntu0.22.04.1
sudo apt install nvidia-utils-470-server  # version 470.223.02-0ubuntu0.22.04.1
sudo apt install nvidia-utils-525         # version 525.147.05-0ubuntu0.22.04.1
sudo apt install nvidia-utils-525-server  # version 525.147.05-0ubuntu0.22.04.1
sudo apt install nvidia-utils-535         # version 535.129.03-0ubuntu0.22.04.1
sudo apt install nvidia-utils-535-server  # version 535.129.03-0ubuntu0.22.04.1
sudo apt install nvidia-utils-510         # version 510.60.02-0ubuntu1
sudo apt install nvidia-utils-510-server  # version 510.47.03-0ubuntu3

nvtop returns: no GPU to monitor

but Software & Updates shows that a driver is installed.

after:

sudo apt install nvidia-utils-470

nvidia-smi and nvtop show data as before.

but:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

returns:

2024-02-05 23:10:40.063925: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-05 23:10:40.063972: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-05 23:10:40.064924: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-05 23:10:40.070361: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-05 23:10:40.798184: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-05 23:10:41.390577: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-05 23:10:41.424517: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-05 23:10:41.425007: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-05 23:10:41.425333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2298] Ignoring visible gpu device (device: 0, name: Quadro K3100M, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.

You can clearly see in the nvidia-smi output that glxgears is running on the NVIDIA GPU.


Yes, now I see it too. Thank you!

Any comment on the TensorFlow issue, or should I open another topic for that one?

The CUDA compute capability (cc) of your GPU is 3.0, but the TensorFlow version you're using requires at least cc 3.5, so it can't use your GPU. You will need either an older TensorFlow version or a TensorFlow build recompiled for cc 3.0.
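The comparison TensorFlow reports in its log can be reproduced in a few lines. This is only an illustrative sketch, not TensorFlow's own check: the function names parse_cc and gpu_is_usable are hypothetical, and the two version strings are the ones printed in the log above.

```python
def parse_cc(text):
    """Turn a compute capability string such as "3.0" into the tuple (3, 0)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def gpu_is_usable(gpu_cc, minimum_cc):
    """Return True when the GPU's compute capability meets the required minimum."""
    # Tuples compare major first, then minor, so "3.10" would rank above "3.5".
    return parse_cc(gpu_cc) >= parse_cc(minimum_cc)


# Values from the log: the Quadro K3100M is cc 3.0, the minimum required is 3.5.
print(gpu_is_usable("3.0", "3.5"))  # False
print(gpu_is_usable("3.5", "3.5"))  # True
```

Comparing integer tuples rather than floats or raw strings avoids mis-ordering versions whose minor number has more than one digit.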
