I have a problem, but I don't know what it is, and I'd like someone to help me work through it step by step. Below is the output of the diagnostic commands I ran while trying to fix it.
I need the GPU to train neural networks. I have already tried reinstalling the drivers:
- First I removed all the drivers with sudo apt remove --purge '^nvidia-.*' followed by sudo apt autoremove --purge (I also ran sudo apt-get purge nvidia* libnvidia* linux-modules-nvidia*).
- Then I installed them with sudo ubuntu-drivers install (without specifying a version).
I don't know, maybe my card is dead, or Ubuntu 22.04 is the problem. Maybe it is an old BIOS.
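For anyone who wants to retrace the reinstall, here is a rough sketch of the sequence described above as a script. It assumes the purge pattern was '^nvidia-.*'. The helper names run and reinstall_nvidia and the DRY_RUN variable are made up for illustration, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the purge-and-reinstall sequence described above.
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 runs it.
# run() and reinstall_nvidia() are hypothetical helper names.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

reinstall_nvidia() {
    # Remove every installed package whose name matches the pattern
    # (assumed here to be '^nvidia-.*').
    run sudo apt remove --purge '^nvidia-.*'
    # Drop dependencies that are no longer needed.
    run sudo apt autoremove --purge
    # Let Ubuntu pick and install a driver, with no version given.
    run sudo ubuntu-drivers install
}
```

Calling reinstall_nvidia as-is prints the three commands prefixed with "+". Setting DRY_RUN=0 first would actually execute them, so that should only be done deliberately.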
~ » nvidia-smi raph@raphtop
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
~ »
~ » nvtop 9 ↵ raph@raphtop
No GPU to monitor.
~ » raph@raphtop
~ » dpkg -l | grep -i nvidia raph@raphtop
ii libnvidia-cfg1-565:amd64 565.57.01-0ubuntu1 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-565 565.57.01-0ubuntu1 all Shared files used by the NVIDIA libraries
ii libnvidia-compute-565:amd64 565.57.01-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-compute-565:i386 565.57.01-0ubuntu1 i386 NVIDIA libcompute package
ii libnvidia-decode-565:amd64 565.57.01-0ubuntu1 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-decode-565:i386 565.57.01-0ubuntu1 i386 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-565:amd64 565.57.01-0ubuntu1 amd64 NVENC Video Encoding runtime library
ii libnvidia-encode-565:i386 565.57.01-0ubuntu1 i386 NVENC Video Encoding runtime library
ii libnvidia-extra-565:amd64 565.57.01-0ubuntu1 amd64 Extra libraries for the NVIDIA driver
ii libnvidia-fbc1-565:amd64 565.57.01-0ubuntu1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-fbc1-565:i386 565.57.01-0ubuntu1 i386 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-565:amd64 565.57.01-0ubuntu1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii libnvidia-gl-565:i386 565.57.01-0ubuntu1 i386 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
rc linux-objects-nvidia-535-6.8.0-40-generic 6.8.0-40.40~22.04.3+1 amd64 Linux kernel nvidia modules for version 6.8.0-40 (objects)
rc linux-objects-nvidia-535-6.8.0-45-generic 6.8.0-45.45~22.04.1 amd64 Linux kernel nvidia modules for version 6.8.0-45 (objects)
rc linux-objects-nvidia-535-6.8.0-47-generic 6.8.0-47.47~22.04.1+1 amd64 Linux kernel nvidia modules for version 6.8.0-47 (objects)
rc linux-objects-nvidia-550-6.8.0-47-generic 6.8.0-47.47~22.04.1+1 amd64 Linux kernel nvidia modules for version 6.8.0-47 (objects)
ii nvidia-compute-utils-565 565.57.01-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-dkms-565 565.57.01-0ubuntu1 amd64 NVIDIA DKMS package
ii nvidia-driver-565 565.57.01-0ubuntu1 amd64 NVIDIA driver metapackage
ii nvidia-firmware-565-565.57.01 565.57.01-0ubuntu1 amd64 Firmware files used by the kernel module
ii nvidia-kernel-common-565 565.57.01-0ubuntu1 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-565 565.57.01-0ubuntu1 amd64 NVIDIA kernel source package
ii nvidia-modprobe 565.57.01-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-prime 0.8.17.1 all Tools to enable NVIDIA’s Prime
ii nvidia-settings 565.57.01-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-utils-565 565.57.01-0ubuntu1 amd64 NVIDIA driver support binaries
ii nvtop 1.2.2-1 amd64 Interactive NVIDIA GPU process monitor
ii screen-resolution-extra 0.18.2 all Extension for the nvidia-settings control panel
ii xserver-xorg-video-nvidia-565 565.57.01-0ubuntu1 amd64 NVIDIA binary Xorg driver
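A note on the listing above: the rows starting with "rc" are packages that were removed but still have configuration files left behind (here the old 535 and 550 kernel-module packages). They are probably not the cause of the problem, but a small sketch for listing them could look like this. The function name list_rc_packages is made up for illustration.

```shell
#!/bin/sh
# list_rc_packages: read `dpkg -l` output on stdin and print the names of
# packages whose status column is "rc" (removed, config files remaining).
# The function name is hypothetical.
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}
```

It would be used as dpkg -l | grep -i nvidia | list_rc_packages, and the printed names could then be passed to sudo apt purge to clear the leftovers.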
~ » raph@raphtop
~ » lspci | grep -i nvidia raph@raphtop
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] (rev a1)
~ » lspci raph@raphtop
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem (rev 31)
00:15.0 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 (rev 31)
00:16.0 Communication controller: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 (rev 31)
00:17.0 SATA controller: Intel Corporation HM170/QM170 Chipset SATA Controller [AHCI Mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #3 (rev f1)
00:1c.3 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #4 (rev f1)
00:1c.6 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 (rev f1)
00:1f.0 ISA bridge: Intel Corporation HM175 Chipset LPC/eSPI Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller (rev 31)
00:1f.3 Audio device: Intel Corporation CM238 HD Audio Controller (rev 31)
00:1f.4 SMBus: Intel Corporation 100 Series/C230 Series Chipset Family SMBus (rev 31)
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] (rev a1)
02:00.0 Network controller: Intel Corporation Wireless 7265 (rev 59)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
04:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader (rev 01)
~ » raph@raphtop
~ » lsmod | grep nvidia raph@raphtop
~ »
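The empty lsmod result above means no nvidia kernel module is loaded. As a sketch, the two checks below separate "module not loaded" from "module not built for the running kernel". The function names are made up, and the default paths assume a standard Ubuntu layout where DKMS installs modules somewhere under /lib/modules/$(uname -r).

```shell
#!/bin/sh
# nvidia_module_loaded [FILE]: succeed if a module named exactly "nvidia" is
# listed in FILE, which defaults to /proc/modules (the file lsmod formats).
nvidia_module_loaded() {
    grep -q '^nvidia ' "${1:-/proc/modules}"
}

# nvidia_module_built [DIR]: succeed if a file named nvidia.ko (possibly with
# a compression suffix) exists anywhere under DIR, which defaults to the
# module tree of the running kernel.
nvidia_module_built() {
    dir="${1:-/lib/modules/$(uname -r)}"
    [ -n "$(find "$dir" -name 'nvidia.ko*' 2>/dev/null | head -n 1)" ]
}
```

If nvidia_module_built fails, the DKMS build for the current kernel is the likely place to look. If it succeeds while nvidia_module_loaded fails, the module exists but is not being loaded, and the kernel log may say why.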
~ » lspci -v raph@raphtop
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
Subsystem: ASUSTeK Computer Inc. Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
Flags: bus master, fast devsel, latency 0, IOMMU group 1
Capabilities:
Kernel driver in use: skl_uncore
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 122, IOMMU group 2
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000e000-0000efff [size=4K]
Memory behind bridge: de000000-df0fffff [size=17M]
Prefetchable memory behind bridge: 00000000c0000000-00000000d1ffffff [size=288M]
Capabilities:
Kernel driver in use: pcieport
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) (prog-if 00 [VGA controller])
DeviceName: Onboard IGD
Subsystem: ASUSTeK Computer Inc. HD Graphics 630
Flags: bus master, fast devsel, latency 0, IRQ 136, IOMMU group 0
Memory at dd000000 (64-bit, non-prefetchable) [size=16M]
Memory at b0000000 (64-bit, prefetchable) [size=256M]
I/O ports at f000 [size=64]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities:
Kernel driver in use: i915
Kernel modules: i915
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
Subsystem: ASUSTeK Computer Inc. Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
Flags: fast devsel, IRQ 255, IOMMU group 3
Memory at df430000 (64-bit, non-prefetchable) [disabled] [size=4K]
Capabilities:
00:14.0 USB controller: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller
Flags: bus master, medium devsel, latency 0, IRQ 128, IOMMU group 4
Memory at df410000 (64-bit, non-prefetchable) [size=64K]
Capabilities:
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:14.2 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem (rev 31)
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family Thermal Subsystem
Flags: fast devsel, IRQ 18, IOMMU group 4
Memory at df42f000 (64-bit, non-prefetchable) [size=4K]
Capabilities:
Kernel driver in use: intel_pch_thermal
Kernel modules: intel_pch_thermal
00:15.0 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 (rev 31)
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family Serial IO I2C Controller
Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 5
Memory at df42e000 (64-bit, non-prefetchable) [size=4K]
Capabilities:
Kernel driver in use: intel-lpss
Kernel modules: intel_lpss_pci
00:16.0 Communication controller: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 (rev 31)
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family MEI Controller
Flags: bus master, fast devsel, latency 0, IRQ 134, IOMMU group 6
Memory at df42d000 (64-bit, non-prefetchable) [size=4K]
Capabilities:
Kernel driver in use: mei_me
Kernel modules: mei_me
00:17.0 SATA controller: Intel Corporation HM170/QM170 Chipset SATA Controller [AHCI Mode] (rev 31) (prog-if 01 [AHCI 1.0])
Subsystem: ASUSTeK Computer Inc. HM170/QM170 Chipset SATA Controller [AHCI Mode]
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 126, IOMMU group 7
Memory at df428000 (32-bit, non-prefetchable) [size=8K]
Memory at df42c000 (32-bit, non-prefetchable) [size=256]
I/O ports at f090 [size=8]
I/O ports at f080 [size=4]
I/O ports at f060 [size=32]
Memory at df42b000 (32-bit, non-prefetchable) [size=2K]
Capabilities:
Kernel driver in use: ahci
Kernel modules: ahci
00:1c.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #3 (rev f1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 123, IOMMU group 8
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: df300000-df3fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities:
Kernel driver in use: pcieport
00:1c.3 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #4 (rev f1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 124, IOMMU group 9
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 0000d000-0000dfff [size=4K]
Memory behind bridge: df200000-df2fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities:
Kernel driver in use: pcieport
00:1c.6 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 (rev f1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 125, IOMMU group 10
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: df100000-df1fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities:
Kernel driver in use: pcieport
00:1f.0 ISA bridge: Intel Corporation HM175 Chipset LPC/eSPI Controller (rev 31)
Subsystem: ASUSTeK Computer Inc. HM175 Chipset LPC/eSPI Controller
Flags: bus master, medium devsel, latency 0, IOMMU group 11
00:1f.2 Memory controller: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller (rev 31)
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family Power Management Controller
Flags: fast devsel, IOMMU group 11
Memory at df424000 (32-bit, non-prefetchable) [disabled] [size=16K]
00:1f.3 Audio device: Intel Corporation CM238 HD Audio Controller (rev 31) (prog-if 80)
Subsystem: ASUSTeK Computer Inc. CM238 HD Audio Controller
Flags: bus master, fast devsel, latency 32, IRQ 137, IOMMU group 11
Memory at df420000 (64-bit, non-prefetchable) [size=16K]
Memory at df400000 (64-bit, non-prefetchable) [size=64K]
Capabilities:
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel, snd_soc_avs
00:1f.4 SMBus: Intel Corporation 100 Series/C230 Series Chipset Family SMBus (rev 31)
Subsystem: ASUSTeK Computer Inc. 100 Series/C230 Series Chipset Family SMBus
Flags: medium devsel, IRQ 16, IOMMU group 11
Memory at df42a000 (64-bit, non-prefetchable) [size=256]
I/O ports at f040 [size=32]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] (rev a1)
Subsystem: ASUSTeK Computer Inc. GP107M [GeForce GTX 1050 Mobile]
Flags: bus master, fast devsel, latency 0, IRQ 255, IOMMU group 2
Memory at de000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [disabled] [size=128]
Expansion ROM at df000000 [disabled] [size=512K]
Capabilities:
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
02:00.0 Network controller: Intel Corporation Wireless 7265 (rev 59)
Subsystem: Intel Corporation Dual Band Wireless-AC 7265
Flags: bus master, fast devsel, latency 0, IRQ 135, IOMMU group 12
Memory at df300000 (64-bit, non-prefetchable) [size=8K]
Capabilities:
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
Subsystem: ASUSTeK Computer Inc. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 13
I/O ports at d000 [size=256]
Memory at df204000 (64-bit, non-prefetchable) [size=4K]
Memory at df200000 (64-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: r8169
Kernel modules: r8169
04:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader (rev 01)
Subsystem: ASUSTeK Computer Inc. RTS5229 PCI Express Card Reader
Flags: bus master, fast devsel, latency 0, IRQ 127, IOMMU group 14
Memory at df100000 (32-bit, non-prefetchable) [size=4K]
Capabilities:
Kernel driver in use: rtsx_pci
Kernel modules: rtsx_pci
~ » raph@raphtop
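One thing that stands out in the lspci -v output above: the 01:00.0 entry lists candidate kernel modules but has no "Kernel driver in use" line, so nothing is bound to the GPU. A minimal sketch for spotting such devices is below. It assumes the usual lspci text layout, with each device header starting at a bus address like 01:00.0 and no PCI domain prefix, and the function name unbound_devices is made up.

```shell
#!/bin/sh
# unbound_devices: read `lspci -k` or `lspci -v` output on stdin and print the
# header line of every device that lists kernel modules but has no
# "Kernel driver in use" line. The function name is hypothetical.
unbound_devices() {
    awk '
        function flush() {
            if (dev != "" && mods && !bound) print dev
        }
        # A header line starts with a bus address such as 01:00.0
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / {
            flush()
            dev = $0; mods = 0; bound = 0
            next
        }
        /Kernel driver in use:/ { bound = 1 }
        /Kernel modules:/       { mods = 1 }
        END { flush() }
    '
}
```

It would be run as lspci -k | unbound_devices. On the output pasted above it should print only the GTX 1050 Mobile line.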