Hi,
Today my drivers stopped working.
Here is the debug log printed by sudo nvidia-bug-report.sh
nvidia-bug-report.log.gz (152.7 KB)
Here is the history of commands I have tried:
debug-history.txt (12.5 KB)
Please post the output of
dpkg -l | grep nvidia
and
ls -l /usr/lib/x86_64-linux-gnu/
dpkg -l | grep nvidia:
dpkg.txt (3.4 KB)
(Sorry, as a new user I can only post one link at a time.)
/usr/lib/x86_64-linux-gnu/:
x86_64-linux-gnu.txt (289.0 KB)
this could be interesting as well:
cat: /proc/driver/nvidia/version: No such file or directory
cat: /sys/module/nvidia/version: No such file or directory
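Side note on the two errors above: both files only exist while the nvidia kernel module is loaded, so "No such file or directory" here points at the module not being loaded rather than at files missing from disk. A minimal sketch of that check follows; the function name module_loaded and its optional second argument are made up for illustration, not part of any NVIDIA tool.

```shell
# Report whether a kernel module appears to be loaded by looking for its
# directory under /sys/module. The optional second argument (an alternative
# sysfs module root) is a hypothetical hook so the check can be tried
# against a fake directory tree.
# Caveat: built-in code that exposes parameters also shows up under
# /sys/module, so this is only a rough indicator.
module_loaded() {
    [ -d "${2:-/sys/module}/$1" ]
}

# Example (on the affected machine):
#   module_loaded nvidia && cat /proc/driver/nvidia/version
```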
Looks fine, packages and files have the correct versions.
Please post the output of
dkms status
and
sudo modprobe nvidia
dkms status
nvidia, 510.73.05, 5.13.0-41-generic, x86_64: installed
rts_pstor, 1.11: added
v4l2loopback, 0.12.3, 5.13.0-40-generic, x86_64: installed
v4l2loopback, 0.12.3, 5.13.0-41-generic, x86_64: installed
virtualbox, 6.1.32, 5.13.0-40-generic, x86_64: installed
virtualbox, 6.1.32, 5.13.0-41-generic, x86_64: installed
sudo modprobe nvidia -vv
modprobe: INFO: ../libkmod/libkmod.c:365 kmod_set_log_fn() custom logging function 0x557dbc565b90 registered
insmod /lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko
modprobe: INFO: ../libkmod/libkmod-module.c:892 kmod_module_insert_module() Failed to insert module '/lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko': Exec format error
modprobe: ERROR: could not insert 'nvidia': Exec format error
modprobe: INFO: ../libkmod/libkmod.c:332 kmod_unref() context 0x557dbd0d7490 released
For completeness' sake:
lspci -k | grep -A 2 -E "(VGA|3D)"
26:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] (rev a1)
Subsystem: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER]
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
dmesg | grep nvidia
[ 0.553126] nvidia-gpu 0000:26:00.3: enabling device (0000 -> 0002)
[ 4.546545] nvidiafb 0000:26:00.0: BAR 1: can't reserve [mem 0xe0000000-0xefffffff 64bit pref]
[ 4.546550] nvidiafb: cannot request PCI regions
[ 5.064026] nvidia: loading out-of-tree module taints kernel.
[ 5.064040] nvidia: module license 'NVIDIA' taints kernel.
[ 5.077167] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 5.573029] nvidia-gpu 0000:26:00.3: i2c timeout error e0000000
[ 7.426170] audit: type=1400 audit(1653048736.544:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1150 comm="apparmor_parser"
[ 7.426173] audit: type=1400 audit(1653048736.544:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1150 comm="apparmor_parser"
lsmod | grep nvidia
nvidiafb 53248 0
vgastate 20480 1 nvidiafb
fb_ddc 16384 1 nvidiafb
i2c_algo_bit 16384 1 nvidiafb
i2c_nvidia_gpu 16384 0
Please blacklist nvidiafb, then try rebuilding the nvidia driver:
sudo dkms remove nvidia/510.73.05
sudo dkms install nvidia/510.73.05
after reboot post the output of
sudo modprobe nvidia
Please also post the output of
cc -v
nvidiafb was already blacklisted, but I have now done it the way you told the other guy here:
sudo dkms remove nvidia/510.73.05
Error! Invalid number of parameters passed.
Usage: remove <module>/<module-version> --all
or: remove <module>/<module-version> -k <kernel-version>
so I did sudo dkms remove nvidia/510.73.05 --all
-------- Uninstall Beginning --------
Module: nvidia
Version: 510.73.05
Kernel: 5.13.0-41-generic (x86_64)
-------------------------------------
Status: Before uninstall, this module version was ACTIVE on this kernel.
nvidia.ko:
- Uninstallation
- Deleting from: /lib/modules/5.13.0-41-generic/updates/dkms/
- Original module
- No original module was found for this module on this kernel.
- Use the dkms install command to reinstall any previous module version.
nvidia-modeset.ko:
- Uninstallation
- Deleting from: /lib/modules/5.13.0-41-generic/updates/dkms/
- Original module
- No original module was found for this module on this kernel.
- Use the dkms install command to reinstall any previous module version.
nvidia-drm.ko:
- Uninstallation
- Deleting from: /lib/modules/5.13.0-41-generic/updates/dkms/
- Original module
- No original module was found for this module on this kernel.
- Use the dkms install command to reinstall any previous module version.
nvidia-uvm.ko:
- Uninstallation
- Deleting from: /lib/modules/5.13.0-41-generic/updates/dkms/
- Original module
- No original module was found for this module on this kernel.
- Use the dkms install command to reinstall any previous module version.
nvidia-peermem.ko:
- Uninstallation
- Deleting from: /lib/modules/5.13.0-41-generic/updates/dkms/
- Original module
- No original module was found for this module on this kernel.
- Use the dkms install command to reinstall any previous module version.
depmod....
DKMS: uninstall completed.
------------------------------
Deleting module version: 510.73.05
completely from the DKMS tree.
------------------------------
Done.
sudo dkms install nvidia/510.73.05
Creating symlink /var/lib/dkms/nvidia/510.73.05/source ->
/usr/src/nvidia-510.73.05
DKMS: add completed.
Kernel preparation unnecessary for this kernel. Skipping...
applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
Hunk #1 succeeded at 82 (offset 11 lines).
Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j16 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.13.0-41-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.13.0-41-generic/build LD=/usr/bin/ld.bfd modules........
Signing module:
- /var/lib/dkms/nvidia/510.73.05/5.13.0-41-generic/x86_64/module/nvidia-drm.ko
- /var/lib/dkms/nvidia/510.73.05/5.13.0-41-generic/x86_64/module/nvidia-uvm.ko
- /var/lib/dkms/nvidia/510.73.05/5.13.0-41-generic/x86_64/module/nvidia-modeset.ko
- /var/lib/dkms/nvidia/510.73.05/5.13.0-41-generic/x86_64/module/nvidia.ko
- /var/lib/dkms/nvidia/510.73.05/5.13.0-41-generic/x86_64/module/nvidia-peermem.ko
Secure Boot not enabled on this system.
cleaning build area...
DKMS: build completed.
nvidia.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.13.0-41-generic/updates/dkms/
nvidia-modeset.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.13.0-41-generic/updates/dkms/
nvidia-drm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.13.0-41-generic/updates/dkms/
nvidia-uvm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.13.0-41-generic/updates/dkms/
nvidia-peermem.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.13.0-41-generic/updates/dkms/
depmod...
DKMS: install completed.
rebooted
sudo modprobe nvidia -vv
modprobe: INFO: ../libkmod/libkmod.c:365 kmod_set_log_fn() custom logging function 0x5562d6da7b90 registered
insmod /lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko
modprobe: INFO: ../libkmod/libkmod-module.c:892 kmod_module_insert_module() Failed to insert module '/lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko': Exec format error
modprobe: ERROR: could not insert 'nvidia': Exec format error
modprobe: INFO: ../libkmod/libkmod.c:332 kmod_unref() context 0x5562d72f8490 released
cc -v
Using built-in specs.
COLLECT_GCC=cc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
locate nvidia.ko
/usr/lib/modules/5.13.0-40-generic/kernel/drivers/usb/typec/altmodes/typec_nvidia.ko
/usr/lib/modules/5.13.0-41-generic/kernel/drivers/usb/typec/altmodes/typec_nvidia.ko
/usr/lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko
/var/lib/dkms/nvidia/470.129.06/5.13.0-41-generic/x86_64/module/nvidia.ko
Exec format error usually only happens if there’s something wrong with the build system. cc/gcc is correctly set, though. Did you install or change anything regarding compilers/linkers recently?
Please post the output of
ld -v
Reinstalling the headers might be worth a shot as well
sudo apt install --reinstall linux-headers-$(uname -r)
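One more quick check that sometimes helps with Exec format error is comparing the kernel release stored in the module's vermagic string with the running kernel. The sketch below assumes the usual vermagic layout, where the first word is the kernel release; the function name vermagic_matches is made up for illustration. A match does not prove the build is healthy (broken headers can still produce a bad module), so treat it as one data point only.

```shell
# Compare the kernel release recorded in a module's vermagic string with a
# given kernel release. Returns 0 when they match, 1 otherwise.
vermagic_matches() {
    vermagic=$1   # e.g. the output of: modinfo -F vermagic <module.ko>
    release=$2    # e.g. the output of: uname -r
    # Assumption: the first space-separated word of vermagic is the release.
    [ "${vermagic%% *}" = "$release" ]
}

# Example (on the affected machine):
#   ko=/lib/modules/$(uname -r)/updates/dkms/nvidia.ko
#   vermagic_matches "$(modinfo -F vermagic "$ko")" "$(uname -r)" \
#       || echo "vermagic does not match the running kernel"
```

Running sudo dmesg | tail right after the failed modprobe usually shows the kernel's own reason for rejecting the module as well.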
I didn’t change anything on compilers/linkers - or not that I know of.
ld -v
GNU ld (GNU Binutils for Ubuntu) 2.34
I almost missed this when executing
sudo apt install --reinstall linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 1 not upgraded.
Need to get 2.581 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://de.archive.ubuntu.com/ubuntu focal-updates/main amd64 linux-headers-5.13.0-41-generic amd64 5.13.0-41.46~20.04.1 [2.581 kB]
Fetched 2.581 kB in 0s (5.665 kB/s)
(Reading database ... 505201 files and directories currently installed.)
Preparing to unpack .../linux-headers-5.13.0-41-generic_5.13.0-41.46~20.04.1_amd64.deb ...
Unpacking linux-headers-5.13.0-41-generic (5.13.0-41.46~20.04.1) over (5.13.0-41.46~20.04.1) ...
Setting up linux-headers-5.13.0-41-generic (5.13.0-41.46~20.04.1) ...
/etc/kernel/header_postinst.d/dkms:
* dkms: running auto installation service for kernel 5.13.0-41-generic
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...
make -j16 KERNELRELEASE=5.13.0-41-generic KVERSION=5.13.0-41-generic....(bad exit status: 2)
ERROR (dkms apport): binary package for rts_pstor: 1.11 not found
Error! Bad return status for module build on kernel: 5.13.0-41-generic (x86_64)
Consult /var/lib/dkms/rts_pstor/1.11/build/make.log for more information.
...done.
cat /var/lib/dkms/rts_pstor/1.11/build/make.log
DKMS make.log for rts_pstor-1.11 for kernel 5.13.0-41-generic (x86_64)
Fr 20. Mai 20:08:53 CEST 2022
sed "s/RTSX_MK_TIME/`date +%y.%m.%d.%H.%M`/" timestamp.in > timestamp.h
cp -f ./define.release ./define.h
make -C /lib/modules/5.13.0-41-generic/build/ SUBDIRS=/var/lib/dkms/rts_pstor/1.11/build modules
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
make[1]: Entering directory '/usr/src/linux-headers-5.13.0-41-generic'
SYNC include/config/auto.conf.cmd
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
YACC scripts/kconfig/parser.tab.[ch]
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/menu.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
make[2]: *** No rule to make target 'arch/x86/entry/syscalls/syscall_32.tbl', needed by 'arch/x86/include/generated/uapi/asm/unistd_32.h'. Stop.
make[1]: *** [arch/x86/Makefile:231: archheaders] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.13.0-41-generic'
make: *** [Makefile:39: default] Error 2
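The module that fails to build here is the one dkms status listed earlier as "added" rather than "installed". A small sketch for spotting such entries is below; it assumes the "name, version, kernel, arch: state" layout shown earlier in this thread, and the function name dkms_not_installed is made up for illustration.

```shell
# Print every DKMS entry whose state does not start with "installed".
# Reads `dkms status` output on stdin. The state is taken to be whatever
# follows the last ": " on each line (an assumption based on the output
# format shown in this thread).
dkms_not_installed() {
    awk -F': ' 'NF && $NF !~ /^installed/ { print }'
}

# Example (on the affected machine):
#   dkms status | dkms_not_installed
```

Any line it prints is a module that DKMS will try to build again on the next kernel or header update, which is exactly what happened with rts_pstor here.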
Now I remember that I installed rts_pstor in order to get a card reader working, and I often got the same error as above:
ERROR (dkms apport): binary package for rts_pstor: 1.11 not found
Now I have deleted the source folder /usr/src/rts_pstor-1.11 as well as /var/lib/dkms/rts_pstor/, and I'm getting the following error:
sudo apt install --reinstall linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 1 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
E: Internal Error, No file name for linux-headers-5.13.0-41-generic:amd64
Then I ran sudo apt-get upgrade.
After that I ran:
sudo apt install --reinstall linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 1 not upgraded.
Need to get 0 B/2.581 kB of archives.
After this operation, 0 B of additional disk space will be used.
(Reading database ... 505201 files and directories currently installed.)
Preparing to unpack .../linux-headers-5.13.0-41-generic_5.13.0-41.46~20.04.1_amd64.deb ...
Unpacking linux-headers-5.13.0-41-generic (5.13.0-41.46~20.04.1) over (5.13.0-41.46~20.04.1) ...
Setting up linux-headers-5.13.0-41-generic (5.13.0-41.46~20.04.1) ...
/etc/kernel/header_postinst.d/dkms:
* dkms: running auto installation service for kernel 5.13.0-41-generic
...done.
Will now reboot.
Didn't help, though.
Here is another bug report:
nvidia-bug-report.log.gz (151.4 KB)
sudo modprobe nvidia -vv
modprobe: INFO: ../libkmod/libkmod.c:365 kmod_set_log_fn() custom logging function 0x558eef961b90 registered
insmod /lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko
modprobe: INFO: ../libkmod/libkmod-module.c:892 kmod_module_insert_module() Failed to insert module '/lib/modules/5.13.0-41-generic/updates/dkms/nvidia.ko': Exec format error
modprobe: ERROR: could not insert 'nvidia': Exec format error
modprobe: INFO: ../libkmod/libkmod.c:332 kmod_unref() context 0x558ef0434490 released
Okay it all works now - thank you very much for your help and patience!
After I removed rts_pstor-1.11 I could properly reinstall the Linux headers with
sudo apt install --reinstall linux-headers-$(uname -r)
Then, after a reboot (not sure if necessary), I repeated the steps from here:
sudo dkms remove nvidia/510.73.05 --all
sudo dkms install nvidia/510.73.05
rebooted and it works now:
nvidia-smi
Fri May 20 22:56:00 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05 Driver Version: 510.73.05 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:26:00.0 On | N/A |
| 31% 61C P0 45W / 235W | 2009MiB / 8192MiB | 10% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2232 G /usr/lib/xorg/Xorg 162MiB |
| 0 N/A N/A 3552 G /usr/lib/xorg/Xorg 686MiB |
| 0 N/A N/A 3737 G ...oud-3.3.6-x86_64.AppImage 10MiB |
| 0 N/A N/A 3747 G ...b/thunderbird/thunderbird 63MiB |
| 0 N/A N/A 4310 G ...AAAAAAAAA= --shared-files 82MiB |
| 0 N/A N/A 4317 G ...142885991480565644,131072 48MiB |
| 0 N/A N/A 4677 G ...RendererForSitePerProcess 17MiB |
| 0 N/A N/A 5367 G ...llation/ubuntu12_32/steam 61MiB |
| 0 N/A N/A 5392 G ...token=7843809473786170320 35MiB |
| 0 N/A N/A 5395 G ...oken=10703937685353338314 46MiB |
| 0 N/A N/A 6046 G ...plications/Zoiper5/zoiper 22MiB |
| 0 N/A N/A 6497 G ...ef_log.txt --shared-files 238MiB |
| 0 N/A N/A 7063 G ...132906759813366195,131072 513MiB |
+-----------------------------------------------------------------------------+
I will edit the topic to reflect the main symptom we found
Btw, I deleted rts_pstor by doing:
sudo rm -rf /var/lib/dkms/rts_pstor/
sudo rm -rf /usr/src/rts_pstor-1.11
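For anyone landing here later, the whole sequence can be sketched as a small dry-run script. Two things in it are assumptions and not what was actually run in this thread: it uses dkms remove with --all for the broken module instead of the rm -rf commands above (untested here, and it may not work once the directories are already deleted), and the helper names run and recover_nvidia_dkms are made up for illustration. By default it only prints the commands.

```shell
# Print the command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Recovery steps as described in this thread, in order.
recover_nvidia_dkms() {
    broken=$1   # broken DKMS module, e.g. rts_pstor/1.11
    driver=$2   # NVIDIA DKMS module, e.g. nvidia/510.73.05
    kernel=$3   # kernel release, e.g. "$(uname -r)"
    # Assumption: dkms remove stands in for the manual rm -rf used above.
    run sudo dkms remove "$broken" --all &&
    run sudo apt install --reinstall "linux-headers-$kernel" &&
    run sudo dkms remove "$driver" --all &&
    run sudo dkms install "$driver"
}

# Example (prints the four commands without running them):
#   recover_nvidia_dkms rts_pstor/1.11 nvidia/510.73.05 "$(uname -r)"
```

A reboot is still needed afterwards, as in the posts above.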