"driver/library version mismatch: unknown"

OS: Ubuntu 20.04.1 LTS
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
GPUs: 4 × RTX 4090
Last week, Docker could build without any issues. However, two days later it started failing with the error “driver/library version mismatch: unknown”. Many people online suggest that a reboot solves the problem, but we need our application to be stable, so I would like to know what exactly happened and how to avoid such incidents entirely.
Here’s more information:

Attaching to test-lora-sd-webui1, train-lora1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown

Additionally, here’s the nvidia-bug-report.sh output:
nvidia-bug-report.log.gz (1.3 MB)
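
For context, that NVML error means the userspace driver library and the loaded kernel module are reporting different versions, which typically happens when the driver packages are upgraded (for example by automatic updates) while the old module is still loaded. A minimal diagnostic sketch, assuming a standard Ubuntu driver install (library path and log locations may differ on your system):

# Version of the NVIDIA kernel module currently loaded
cat /proc/driver/nvidia/version

# Version of the userspace NVML library installed on disk
ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*

# nvidia-smi fails with the same "Driver/library version mismatch"
# message when the two versions above have diverged
nvidia-smi

# Check whether the driver packages were upgraded recently
# (unattended-upgrades is a frequent culprit on Ubuntu)
grep -i nvidia /var/log/dpkg.log | tail -n 20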

Not to hijack your topic, but I’d like to let you know I’m tracking this because I have the exact same issue.

Our server happily runs several DeepStream containers day and night. We start and stop containers constantly; some run for weeks, others for minutes. All seems fine, but sometimes it suddenly becomes impossible to start new containers, and Docker gives exactly the error you describe.

All we can do to fix it is a hard reboot, which takes all our running analysis containers offline for a few minutes.

Sometimes we have to do this daily, sometimes not for weeks. It’s a really strange problem. I’d like to know how to debug this further if anyone has ideas.
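
Not a root-cause answer, but two things that are often suggested for this failure mode: hold the driver packages so an automatic upgrade cannot replace the userspace libraries while the old kernel module is still loaded, and, once the mismatch has occurred, try reloading the kernel modules instead of a hard reboot (this only works if no process is still using the GPU). A rough sketch, assuming Ubuntu’s apt packaging; the driver branch number 525 below is a placeholder for whatever is actually installed:

# Pin the driver packages so automatic upgrades cannot create the mismatch
# (replace 525 with the installed driver branch)
sudo apt-mark hold nvidia-driver-525 libnvidia-compute-525

# If the mismatch has already happened, try reloading the modules instead
# of rebooting; rmmod fails if any process still holds the GPU
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia
nvidia-smi   # should now report a consistent driver version again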


This question is important; everyone is eager to know the underlying cause, not just to reboot the device.
