Device not found (Ubuntu 20.04 / Dell Precision / RTX A4000 / RmInitAdapter failed)

Hello,

I am unable to make my GPU work on Ubuntu 20.04 LTS.
The GPU is an RTX A4000.

Here are my bug report and kern.log
The latter says:
Feb 8 07:35:47 loicus-DA kernel: [ 288.919473] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
Feb 8 07:35:47 loicus-DA kernel: [ 288.919576] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Feb 8 07:35:54 loicus-DA kernel: [ 296.096457] NVRM: Xid (PCI:0000:01:00): 79, pid=5156, GPU has fallen off the bus.
Feb 8 07:35:54 loicus-DA kernel: [ 296.096508] NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Feb 8 07:35:54 loicus-DA kernel: [ 296.097573] NVRM: A GPU crash dump has been created. If possible, please run
Feb 8 07:35:54 loicus-DA kernel: [ 296.097573] NVRM: nvidia-bug-report.sh as root to collect this data before
Feb 8 07:35:54 loicus-DA kernel: [ 296.097573] NVRM: the NVIDIA kernel module is unloaded.
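
For anyone wanting to pull only the relevant driver messages out of a large kern.log, a filter along these lines may help. This is a minimal sketch: the function name nvrm_events is made up for illustration, and the default path /var/log/kern.log is assumed to be the usual Ubuntu location.

```shell
# Hypothetical helper: print NVRM lines that report an Xid event,
# an adapter init failure, or the GPU falling off the bus.
# $1 = log file to scan (assumed default: /var/log/kern.log).
nvrm_events() {
    grep -E 'NVRM:.*(Xid|RmInitAdapter failed|rm_init_adapter failed|fallen off the bus)' \
        "${1:-/var/log/kern.log}"
}
```

Calling `nvrm_events` with no argument scans the default log; pass a path to scan a saved copy instead.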

I tried to reinstall everything from scratch, reinstall the drivers in several different ways, etc…
Nothing is working… I suspect the GPU is dead, but I’d be thankful to get a confirmation

nvidia-bug-report.log (1.0 MB)
kern.log (139.0 KB)

Generix, please Help!!!

Thanks in advance,
Loic

Since this is a laptop, the GPU is not necessarily broken. It falls off the bus first, which would point to a power management/bus/kernel problem. Please try

  • updating the BIOS
  • setting the kernel parameter intel_idle.max_cstate=1
  • using a different kernel
    The Ubuntu 5.13 kernel was released with a lot of bugs. Please check whether you have a 5.11 kernel available in the GRUB menu, or try the liquorix kernel PPA:
    https://launchpad.net/~damentz/+archive/ubuntu/liquorix

Regarding updating the BIOS: I tried to do this. Is this what you mean?
loicus@loicus-DA:~$ sudo fwupdmgr refresh --force
Updating lvfs
Downloading…             [***************************************]
Successfully downloaded new metadata: 1 local device supported
loicus@loicus-DA:~$ sudo fwupdmgr update
Devices with no available firmware updates: 
 • PM9A1 NVMe Samsung 1024GB
 • PM9A1 NVMe Samsung 1024GB
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI dbx
Devices with the latest available firmware version:
 • System Firmware

I’ve set in my /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=1"

Then I ran “sudo update-grub”

But it doesn’t change anything.
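
A change in /etc/default/grub only reaches the kernel after update-grub and a reboot, so it can be worth confirming that the parameter is actually on the running kernel's command line. A minimal sketch, assuming the helper name has_kernel_param (hypothetical) and reading /proc/cmdline by default:

```shell
# Hypothetical helper: succeed if the exact parameter $1 is present on the
# kernel command line. $2 = file to read (default: /proc/cmdline).
has_kernel_param() {
    # Split the command line into one parameter per line, then match exactly.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `has_kernel_param intel_idle.max_cstate=1 && echo active` prints "active" only if the setting took effect for the current boot. On systems that use the intel_idle driver, /sys/module/intel_idle/parameters/max_cstate should show the same value.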

So I went further and installed the liquorix kernel.
This leads to the following message from “nvidia-smi”:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

That sounds like a step in the right direction to me, but we are not there yet :-)

I tried to uninstall and reinstall all nvidia stuff, but it didn’t help

sudo apt purge nvidia*
sudo ubuntu-drivers autoinstall

Here is the new bug report:
nvidia-bug-report.log (471.4 KB)

Thanks for your help, it’s really appreciated!

It seems the kernel modules didn’t compile. Please reinstall the kernel headers:
sudo apt install linux-headers-$(uname -r)
then post the output of
dkms status

loicus@loicus-DA:~$ sudo apt --reinstall install linux-headers-$(uname -r)
[sudo] password for loicus: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  apt-clone archdetect-deb dmraid gir1.2-timezonemap-1.0 gir1.2-xkl-1.0 glib-networking:i386 gstreamer1.0-plugins-base:i386 kpartx kpartx-boot libapparmor1:i386 libargon2-1:i386 libasyncns0:i386
  libbrotli1:i386 libcairo2:i386 libcap2:i386 libcdparanoia0:i386 libdbus-1-3:i386 libdebian-installer4 libdevmapper1.02.1:i386 libdmraid1.0.0.rc16 libflac8:i386 libfontconfig1:i386 libfreetype6:i386
  libglib2.0-0:i386 libgmp10:i386 libgnutls30:i386 libgomp1:i386 libgssapi-krb5-2:i386 libgstreamer-plugins-base1.0-0:i386 libgstreamer1.0-0:i386 libhogweed5:i386 libice6:i386 libicu66:i386 libip4tc2:i386
  libjack-jackd2-0:i386 libjson-c4:i386 libjson-glib-1.0-0:i386 libk5crypto3:i386 libkeyutils1:i386 libkrb5-3:i386 libkrb5support0:i386 libltdl7:i386 libnettle7:i386 libogg0:i386 libopus0:i386
  liborc-0.4-0:i386 libp11-kit0:i386 libpixman-1-0:i386 libpng16-16:i386 libproxy1v5:i386 libpsl5:i386 libsamplerate0:i386 libseccomp2:i386 libsm6:i386 libsnapd-glib1:i386 libsndfile1:i386 libsoup2.4-1:i386
  libsoxr0:i386 libspeexdsp1:i386 libsqlite3-0:i386 libssl1.1:i386 libtasn1-6:i386 libtdb1:i386 libtheora0:i386 libtimezonemap-data libtimezonemap1 libvisual-0.4-0:i386 libvorbis0a:i386 libvorbisenc2:i386
  libwebrtc-audio-processing1:i386 libwrap0:i386 libxcb-render0:i386 libxml2:i386 libxrender1:i386 libxtst6:i386 linux-headers-5.13.0-1010-oem linux-image-5.13.0-1010-oem linux-modules-5.13.0-1010-oem
  linux-oem-5.13-headers-5.13.0-1010 python3-icu python3-pam rdate
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 0 B/12,0 MB of archives.
After this operation, 0 B of additional disk space will be used.
(Reading database ... 218744 files and directories currently installed.)
Preparing to unpack .../linux-headers-5.16.0-7.2-liquorix-amd64_5.16-5ubuntu1~focal_amd64.deb ...
Unpacking linux-headers-5.16.0-7.2-liquorix-amd64 (5.16-5ubuntu1~focal) over (5.16-5ubuntu1~focal) ...
Setting up linux-headers-5.16.0-7.2-liquorix-amd64 (5.16-5ubuntu1~focal) ...
/etc/kernel/header_postinst.d/dkms:
 * dkms: running auto installation service for kernel 5.16.0-7.2-liquorix-amd64
   ...done.
loicus@loicus-DA:~$ dkms status
loicus@loicus-DA:~$ 
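
An empty dkms status means DKMS has no module registered at all, so nothing was built for the running kernel. One way to double-check is to look for NVIDIA module files under the kernel's module directory. A minimal sketch, with a hypothetical helper name and the standard /lib/modules layout assumed:

```shell
# Hypothetical helper: list NVIDIA kernel module files for a kernel release.
# $1 = kernel release (default: the running kernel, from uname -r)
# $2 = modules root   (assumed default: /lib/modules)
nvidia_modules_for() {
    find "${2:-/lib/modules}/${1:-$(uname -r)}" -name 'nvidia*.ko*' 2>/dev/null
}
```

No output from `nvidia_modules_for` suggests that there is no NVIDIA module for that kernel, which would fit the nvidia-smi message about not being able to communicate with the driver.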

Please post the output of
dpkg -l |grep nvidia

loicus@loicus-DA:~$ dpkg -l |grep nvidia
ii  libnvidia-cfg1-510:amd64                      510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-510                          510.47.03-0ubuntu0.20.04.1            all          Shared files used by the NVIDIA libraries
rc  libnvidia-compute-470:amd64                   470.103.01-0ubuntu0.20.04.1           amd64        NVIDIA libcompute package
rc  libnvidia-compute-470-server:amd64            470.103.01-0ubuntu0.20.04.1           amd64        NVIDIA libcompute package
ii  libnvidia-compute-510:amd64                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA libcompute package
ii  libnvidia-compute-510:i386                    510.47.03-0ubuntu0.20.04.1            i386         NVIDIA libcompute package
ii  libnvidia-decode-510:amd64                    510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-510:i386                     510.47.03-0ubuntu0.20.04.1            i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-510:amd64                    510.47.03-0ubuntu0.20.04.1            amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-510:i386                     510.47.03-0ubuntu0.20.04.1            i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-510:amd64                     510.47.03-0ubuntu0.20.04.1            amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-510:amd64                      510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-510:i386                       510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-510:amd64                        510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-510:i386                         510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  linux-modules-nvidia-510-5.13.0-1029-oem      5.13.0-1029.36+1                      amd64        Linux kernel nvidia modules for version 5.13.0-1029
ii  linux-modules-nvidia-510-oem-20.04c           5.13.0-1029.36+1                      amd64        Extra drivers for nvidia-510 for the oem-20.04c flavour
rc  linux-objects-nvidia-470-5.11.0-1028-aws      5.11.0-1028.31~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.11.0-1028 (objects)
rc  linux-objects-nvidia-470-5.11.0-1028-azure    5.11.0-1028.31~20.04.2+1              amd64        Linux kernel nvidia modules for version 5.11.0-1028 (objects)
rc  linux-objects-nvidia-470-5.11.0-1028-oracle   5.11.0-1028.31~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.11.0-1028 (objects)
rc  linux-objects-nvidia-470-5.11.0-1029-gcp      5.11.0-1029.33~20.04.3+1              amd64        Linux kernel nvidia modules for version 5.11.0-1029 (objects)
rc  linux-objects-nvidia-470-5.13.0-1012-aws      5.13.0-1012.13~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.13.0-1012 (objects)
rc  linux-objects-nvidia-470-5.13.0-1013-azure    5.13.0-1013.15~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.13.0-1013 (objects)
rc  linux-objects-nvidia-470-5.13.0-1013-gcp      5.13.0-1013.16~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.13.0-1013 (objects)
rc  linux-objects-nvidia-470-5.13.0-1016-oracle   5.13.0-1016.20~20.04.1+1              amd64        Linux kernel nvidia modules for version 5.13.0-1016 (objects)
rc  linux-objects-nvidia-470-5.13.0-1029-oem      5.13.0-1029.36+1                      amd64        Linux kernel nvidia modules for version 5.13.0-1029 (objects)
rc  linux-objects-nvidia-470-5.13.0-28-generic    5.13.0-28.31~20.04.1+2                amd64        Linux kernel nvidia modules for version 5.13.0-28 (objects)
rc  linux-objects-nvidia-470-5.13.0-28-lowlatency 5.13.0-28.31~20.04.1+2                amd64        Linux kernel nvidia modules for version 5.13.0-28 (objects)
rc  linux-objects-nvidia-470-5.4.0-1062-oracle    5.4.0-1062.66+1                       amd64        Linux kernel nvidia modules for version 5.4.0-1062 (objects)
rc  linux-objects-nvidia-470-5.4.0-1063-gcp       5.4.0-1063.67+1                       amd64        Linux kernel nvidia modules for version 5.4.0-1063 (objects)
rc  linux-objects-nvidia-470-5.4.0-1064-aws       5.4.0-1064.67+1                       amd64        Linux kernel nvidia modules for version 5.4.0-1064 (objects)
rc  linux-objects-nvidia-470-5.4.0-1068-azure     5.4.0-1068.71+1                       amd64        Linux kernel nvidia modules for version 5.4.0-1068 (objects)
rc  linux-objects-nvidia-470-5.4.0-99-generic     5.4.0-99.112+1                        amd64        Linux kernel nvidia modules for version 5.4.0-99 (objects)
rc  linux-objects-nvidia-470-5.4.0-99-lowlatency  5.4.0-99.112+1                        amd64        Linux kernel nvidia modules for version 5.4.0-99 (objects)
ii  linux-objects-nvidia-510-5.13.0-1029-oem      5.13.0-1029.36+1                      amd64        Linux kernel nvidia modules for version 5.13.0-1029 (objects)
ii  linux-signatures-nvidia-5.13.0-1029-oem       5.13.0-1029.36+1                      amd64        Linux kernel signatures for nvidia modules for version 5.13.0-1029-oem
rc  nvidia-compute-utils-470-server               470.103.01-0ubuntu0.20.04.1           amd64        NVIDIA compute utilities
ii  nvidia-compute-utils-510                      510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA compute utilities
rc  nvidia-dkms-470-server                        470.103.01-0ubuntu0.20.04.1           amd64        NVIDIA DKMS package
ii  nvidia-driver-510                             510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver metapackage
rc  nvidia-kernel-common-470-server               470.103.01-0ubuntu0.20.04.1           amd64        Shared files used with the kernel module
ii  nvidia-kernel-common-510                      510.47.03-0ubuntu0.20.04.1            amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-510                      510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA kernel source package
ii  nvidia-prime                                  0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                               470.57.01-0ubuntu0.20.04.2            amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-510                              510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver support binaries
ii  screen-resolution-extra                       0.18build1                            all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-510                 510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary Xorg driver

I imagine that I should purge everything that is not 510?

Yes, it’s a wild mix of 470-server and 510, neither driver being complete. Rather, remove everything *nvidia* and reinstall using the Software & Updates application.
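
The entries marked "rc" in the listing above are packages that were removed but still have configuration files left behind. To see just those before purging, something like the following could be used. It is only a sketch; the helper name nvidia_rc_packages is hypothetical and it expects dpkg -l output on stdin.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# nvidia packages in state "rc" (removed, configuration files remaining).
nvidia_rc_packages() {
    awk '$1 == "rc" && $2 ~ /nvidia/ { print $2 }'
}
```

Run it as `dpkg -l | nvidia_rc_packages` and review the list before handing the names to `sudo apt purge`.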

loicus@loicus-DA:~$ dpkg -l |grep nvidia
ii  libnvidia-cfg1-510:amd64                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-510                       510.47.03-0ubuntu0.20.04.1            all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-510:amd64                510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA libcompute package
ii  libnvidia-compute-510:i386                 510.47.03-0ubuntu0.20.04.1            i386         NVIDIA libcompute package
ii  libnvidia-decode-510:amd64                 510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-510:i386                  510.47.03-0ubuntu0.20.04.1            i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-510:amd64                 510.47.03-0ubuntu0.20.04.1            amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-510:i386                  510.47.03-0ubuntu0.20.04.1            i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-510:amd64                  510.47.03-0ubuntu0.20.04.1            amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-510:amd64                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-510:i386                    510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-510:amd64                     510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-510:i386                      510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-510                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA compute utilities
ii  nvidia-dkms-510                            510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA DKMS package
ii  nvidia-driver-510                          510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-510                   510.47.03-0ubuntu0.20.04.1            amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-510                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA kernel source package
ii  nvidia-prime                               0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                            470.57.01-0ubuntu0.20.04.2            amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-510                           510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver support binaries
ii  screen-resolution-extra                    0.18build1                            all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-510              510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary Xorg driver

loicus@loicus-DA:~$ dkms status
nvidia, 510.47.03, 5.16.0-7.2-liquorix-amd64, x86_64: installed

I rebooted at this point

loicus@loicus-DA:~$ sudo nvidia-smi
No devices were found
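
When nvidia-smi reports no devices even though the module is installed, it may be worth checking whether the GPU is still listed on the PCI bus at all. A minimal sketch that filters lspci -nn output by NVIDIA's PCI vendor ID (10de); the helper name is hypothetical:

```shell
# Hypothetical helper: read `lspci -nn` output on stdin and print only the
# devices whose [vendor:device] ID uses NVIDIA's vendor ID, 10de.
nvidia_pci_devices() {
    grep -i '\[10de:[0-9a-f]\{4\}\]'
}
```

Usage: `lspci -nn | nvidia_pci_devices`. A non-zero exit status with no output means no NVIDIA device is listed on the bus.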

Please create a new nvidia-bug-report.log

nvidia-bug-report.log (1.4 MB)

Same error.
I guess you’ll now have to cross-check whether the GPU is dead by installing Windows.

Arf… I guess I can pick any version?

Just use a Windows 10 image from Microsoft, fetch the NVIDIA drivers from the Dell website, install them, and check whether Windows Device Manager reports “Code 43”.

I do indeed get a Code 43 after installing the latest driver and rebooting.
If I try to open the NVIDIA Control Panel, nothing happens, and when I try to open the NVIDIA RTX Desktop Manager I get an error saying that I need at least an RTX GPU.

I guess this confirms that the GPU is dead?

Thanks for helping
Loic

Yes, it’s dead, sorry. Hope your device is still under warranty.

The computer (and its GPU) is brand new…
what a shame that Dell sold me this…

Thanks a lot for your help generix,
I will now start hassling the commercial team at Dell

@user157369 I have the same issue with a Dell Precision 3650 with an RTX A4000. Did you solve this problem, or does Dell support have a solution? Thank you.

Yes… I asked Dell for reimbursement and got a Lenovo that worked out of the box without any fancy configuration, driver or whatever.

Good luck with Dell technical support.
My experience was awful. They were not even aware that they are selling Ubuntu laptops.

All the best
Loïc
