Nvml error: driver/library version mismatch

nida.bijapure · June 21, 2022, 8:40am

Hi,
I am getting this error while running docker for cuOpt:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown.

Please guide me through this.

user162039 · June 22, 2022, 2:37pm

Hello @nida.bijapure

This can happen when the kernel is using a different version of the NVIDIA driver than the client program. Try the following command; the kernel log may contain a relevant error message:

dmesg | grep NVRM

Here are a couple of things to try:

Do you get the same error from “nvidia-smi” (assuming it is installed)?

nvidia-smi

Have you tried simply rebooting? If the driver version has been updated, a reboot is necessary.

Let’s start there.

nida.bijapure · June 28, 2022, 10:50am

Hi,
I didn’t understand what exactly you are trying to say. Can you please elaborate?

user162039 · June 28, 2022, 2:34pm

Hi @nida.bijapure

These steps may help diagnose the issue. The error can happen when there is a mismatch between a client program or packages on the system and the version of the nvidia driver that is being used by the kernel.

The following command on a Linux system might give us extra information from the system logs:

$ dmesg | grep NVRM

If you have the nvidia-smi executable installed on your system, that might give us a clue too (if nvidia-smi returns a result, but docker has errors, then we know it’s something specific to the docker setup). Run it like this:

$ nvidia-smi

Lastly, if there has been an nvidia driver update, but the system has not been rebooted since the update, rebooting the machine may clear the issue.
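Whether a reboot is pending can be checked before restarting. The following is a minimal sketch, assuming a Linux system where the loaded module exposes its version at /sys/module/nvidia/version and where modinfo can find the on-disk module under the name nvidia. Both are assumptions about the install, and the function names are made up for illustration. If the two versions differ, the driver was updated on disk but the old module is still loaded.

```shell
#!/bin/sh
# Version of the NVIDIA kernel module currently loaded (empty if not loaded).
loaded_version() {
  cat /sys/module/nvidia/version 2>/dev/null
}

# Version of the NVIDIA kernel module installed on disk for the running kernel.
# Assumes the module is registered under the name "nvidia".
installed_version() {
  modinfo -F version nvidia 2>/dev/null
}

# Compare two version strings and print a verdict.
compare_versions() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "unknown"
  elif [ "$1" = "$2" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

echo "loaded:    $(loaded_version)"
echo "installed: $(installed_version)"
echo "verdict:   $(compare_versions "$(loaded_version)" "$(installed_version)")"
```

A verdict of "mismatch" suggests that rebooting (or reloading the module) is worth trying first; "unknown" means one of the two versions could not be read on this system.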

nida.bijapure · June 29, 2022, 11:47am

After following the steps you mentioned, I got the following error:

Please guide me through the next steps.

user162039 · June 29, 2022, 2:05pm

@nida.bijapure

Okay, that gives some clarity. The driver version is older than the installed CUDA version, most likely from mixed install methods.

Your best course of action is to follow this page from NVIDIA

specifically this section:

Incidentally, if you can create a fresh Ubuntu 22.04 machine, you can use this script to install everything you need for cuOpt. It is super-simple and it works well. Are you able to create a new Ubuntu 22.04 machine?
https://ngc.nvidia.com/resources/ea-reopt-member-zone:setup_ubuntu_for_cuopt

hayley1 · November 11, 2023, 2:23am

Same error here.

$ sudo dmesg | grep NVRM

[3705011.768121] NVRM: API mismatch: the client has the version 535.129.03, but
                 NVRM: this kernel module has the version 535.104.05.  Please
                 NVRM: make sure that this kernel module and all NVIDIA driver
                 NVRM: components have the same version.

Which section and which command should I run to downgrade the client version?
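One way to read the message above: "client" is the user-space driver library and "kernel module" is the module currently loaded, so here the libraries (535.129.03) are newer than the loaded module (535.104.05). That pattern usually points to an updated driver whose new module has not been loaded yet, in which case a reboot may be enough and no downgrade is needed. The helper below is a small sketch for pulling the two versions out of the log; the function name parse_nvrm_mismatch is hypothetical, and the patterns assume the message wording shown above.

```shell
#!/bin/sh
# Extract the client and kernel module versions from "NVRM: API mismatch"
# lines read on standard input. Assumes the wording shown in the log above.
# Usage: sudo dmesg | parse_nvrm_mismatch
parse_nvrm_mismatch() {
  sed -n \
    -e 's/.*client has the version \([0-9.]*[0-9]\).*/client=\1/p' \
    -e 's/.*kernel module has the version \([0-9.]*[0-9]\).*/kernel_module=\1/p'
}
```

If the two printed versions differ, the side that is older tells you which component still needs to catch up.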

william.harwell · November 14, 2023, 12:27pm

This worked for me.

willemvdkletersteeg · November 15, 2023, 10:59am

Rebooting works, but only temporarily. Even without updating any drivers the system refuses to start new containers after some time. This could be hours, days or weeks but it does happen without any apparent reason. It’s driving our operations team nuts, as a hard reboot (of a production system) is the only option. It would be really valuable if anyone has a suggestion on how to debug this.
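One thing that might be worth ruling out, offered as an assumption rather than a diagnosis: on Debian or Ubuntu systems, automatic package updates can replace the user-space NVIDIA libraries while the old kernel module stays loaded, which would produce this mismatch without anyone updating drivers by hand. The sketch below lists NVIDIA package upgrades recorded in the dpkg log. It assumes the usual /var/log/dpkg.log location and line layout (date, time, action, package, old version, new version); the function name nvidia_upgrades is made up for illustration.

```shell
#!/bin/sh
# Print NVIDIA-related package upgrades from dpkg log lines.
# Reads the files given as arguments, or standard input if none are given.
# Assumed line layout: DATE TIME upgrade PACKAGE OLD_VERSION NEW_VERSION
nvidia_upgrades() {
  awk '$3 == "upgrade" && $4 ~ /nvidia/ { print $1, $2, $4, $5, "->", $6 }' "$@"
}

# Only scan the log if it exists and is readable on this system.
if [ -r /var/log/dpkg.log ]; then
  nvidia_upgrades /var/log/dpkg.log
fi
```

If the timestamps line up with the moments containers stopped starting, pinning the driver packages (for example with apt-mark hold) or scheduling a reboot after driver updates are options to consider.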
