Hello! My system:
admin2@admin2:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
There are 6 cards in the server:
admin2@admin2:~$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
05:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
06:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
07:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
07:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
08:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
08:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
This is the output of the nvidia-smi command:
admin2@admin2:~$ nvidia-smi
Thu Aug 5 14:58:39 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   34C    P8    19W / 270W |      1MiB /  7979MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:03:00.0 Off |                  N/A |
|  0%   27C    P8    15W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:05:00.0 Off |                  N/A |
|  0%   29C    P8    14W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:06:00.0 Off |                  N/A |
|  0%   28C    P8    10W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
admin2@admin2:~$
That is, nvidia-smi shows only 4 of the 6 video cards: the cards at 07:00.0 and 08:00.0 appear in lspci but are missing from nvidia-smi. All cards are powered on and working. How can this problem be solved?
Moreover, I have an identical server next to it, with the same motherboard and the same cards, and nvidia-smi there shows all 6 cards. It also runs Ubuntu 18.04 LTS, and the driver version is close (465.19.01 there versus 470.57.02 on the server that shows only 4 of 6 cards). Could the driver version be the cause of the problem?
Thanks for any answers!
nvidia-smi output on the other server:
admin2@h1829658:~$ nvidia-smi
Thu Aug 5 14:36:13 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   27C    P8     7W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:02:00.0 Off |                  N/A |
|  0%   27C    P8     7W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:04:00.0 Off |                  N/A |
|  0%   28C    P8    11W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:08:00.0 Off |                  N/A |
|  0%   30C    P8    12W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce ...  On   | 00000000:0A:00.0 Off |                  N/A |
|  0%   27C    P8     6W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA GeForce ...  On   | 00000000:0B:00.0 Off |                  N/A |
|  0%   26C    P8     9W / 270W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
I have also uploaded nvidia-bug-report.log from the server:
nvidia-bug-report.log (152.2 KB)
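
To pin down exactly which cards are missing, the lspci listing can be compared against the bus IDs that nvidia-smi reports. Below is a minimal Python sketch of that comparison. The function names and the two sample strings are only illustrative (the samples are shortened copies of the output above), and the parsing assumes the line layouts shown in this post.

```python
import re

# PCI address without the domain, e.g. "01:00.0" (bus:device.function).
_BDF = r"[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-7]"


def lspci_gpu_addresses(lspci_text):
    """Addresses of NVIDIA display controllers listed by lspci."""
    addresses = []
    for line in lspci_text.splitlines():
        # An optional 4-digit domain prefix appears with `lspci -D`.
        m = re.match(r"\s*(?:[0-9A-Fa-f]{4}:)?(" + _BDF + r")\s+(.*)", line)
        if not m:
            continue
        description = m.group(2)
        is_display = ("VGA compatible controller" in description
                      or "3D controller" in description)
        if is_display and "NVIDIA" in description:
            addresses.append(m.group(1).lower())
    return addresses


def smi_gpu_addresses(smi_text):
    """Addresses of the GPUs that appear in nvidia-smi output."""
    # nvidia-smi prints an 8-digit domain, e.g. "00000000:01:00.0".
    found = re.findall(r"\b[0-9A-Fa-f]{8}:(" + _BDF + r")", smi_text)
    return [a.lower() for a in found]


def missing_gpus(lspci_text, smi_text):
    """GPUs visible on the PCI bus but absent from nvidia-smi."""
    seen = set(smi_gpu_addresses(smi_text))
    return [a for a in lspci_gpu_addresses(lspci_text) if a not in seen]


# Shortened samples taken from the output in this post.
LSPCI_SAMPLE = """\
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
07:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
08:00.0 VGA compatible controller: NVIDIA Corporation Device 2484 (rev a1)
"""

SMI_SAMPLE = """\
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| 1 NVIDIA GeForce ... On | 00000000:03:00.0 Off | N/A |
| 2 NVIDIA GeForce ... On | 00000000:05:00.0 Off | N/A |
| 3 NVIDIA GeForce ... On | 00000000:06:00.0 Off | N/A |
"""

if __name__ == "__main__":
    print(missing_gpus(LSPCI_SAMPLE, SMI_SAMPLE))  # ['07:00.0', '08:00.0']
```

For the output in this post, the sketch reports 07:00.0 and 08:00.0, so those two addresses are the ones to look for in the bug report log.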