Nvidia-container-cli: detection error: nvml error: function not found: unknown

Hi, I have an NVIDIA GRID K2 GPU, and I recently installed nvidia-container-toolkit on Ubuntu 16.04. The installation succeeded, but when I run the command ‘docker run --gpus all --rm debian:10-slim nvidia-smi’, an error occurs: ‘docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: detection error: nvml error: function not found: unknown.’ The details are as follows:

info
docker run --gpus all --rm debian:10-slim nvidia-smi
Unable to find image 'debian:10-slim' locally
10-slim: Pulling from library/debian
f7ec5a41d630: Pull complete
Digest: sha256:b586cf8c850cada85a47599f08eb34ede4a7c473551fd7c68cbf20ce5f8dbbf1
Status: Downloaded newer image for debian:10-slim
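docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: detection error: nvml error: function not found: unknown.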

nvidia-smi
Wed Apr 21 11:26:56 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.134                Driver Version: 367.134                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K2             Off  | 0000:1A:00.0     Off |                  Off |
| N/A   46C    P0    44W / 117W |      0MiB /  4033MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GRID K2             Off  | 0000:1B:00.0     Off |                  Off |
| N/A   42C    P0    42W / 117W |      0MiB /  4033MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GRID K2             Off  | 0000:B1:00.0     Off |                  Off |
| N/A   46C    P0    45W / 117W |      0MiB /  4033MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GRID K2             Off  | 0000:B2:00.0     Off |                  Off |
| N/A   42C    P0    39W / 117W |      0MiB /  4033MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvidia-container-cli -V
version: 1.3.3
build date: 2021-02-05T13:29+00:00
build revision: bd9fc3f2b642345301cb2e23de07ec5386232317
build compiler: gcc-5 5.4.0 20160609
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

uname -a
Linux xidian-S2600WFT 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

nvidia-container-cli -k -d /dev/tty info

-- WARNING, the following logs are for debugging purposes only --

I0421 03:29:14.518588 15027 nvc.c:372] initializing library context (version=1.3.3, build=bd9fc3f2b642345301cb2e23de07ec5386232317)
I0421 03:29:14.518653 15027 nvc.c:346] using root /
I0421 03:29:14.518667 15027 nvc.c:347] using ldcache /etc/ld.so.cache
I0421 03:29:14.518678 15027 nvc.c:348] using unprivileged user 1000:1000
I0421 03:29:14.518740 15027 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0421 03:29:14.518880 15027 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment
W0421 03:29:14.528026 15028 nvc.c:269] failed to set inheritable capabilities
W0421 03:29:14.528133 15028 nvc.c:270] skipping kernel modules load due to failure
I0421 03:29:14.528714 15029 driver.c:101] starting driver service
I0421 03:29:16.399164 15027 nvc_info.c:680] requesting driver information with ''
nvidia-container-cli: detection error: nvml error: function not found
I0421 03:29:16.399574 15027 nvc.c:427] shutting down library context
I0421 03:29:16.882944 15029 driver.c:156] terminating driver service
I0421 03:29:16.883197 15027 driver.c:196] driver service terminated successfully

docker --version
Docker version 20.10.6, build 370c289

I don’t know how to solve it. Can anybody help me?

Try updating your GPU driver. Yes, I understand that the 367.xx driver is the latest one offered for this card, but try installing a newer driver from a CUDA toolkit installer. For example, install the driver bundled with the CUDA 9.0 or 10.0 toolkit installer.
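
The "function not found" NVML error suggests that the libnvidia-ml.so.1 shipped with the installed driver does not export a function that nvidia-container-cli asks for. The thread does not say which function is missing, so the following is only a minimal Python sketch for probing a shared library for a list of symbols. The helper name `missing_symbols` is hypothetical, and the NVML names in the probe list are illustrative examples, not the confirmed culprits.

```python
import ctypes


def missing_symbols(library, names):
    """Return the entries of `names` that `library` does not export.

    `library` is a name or path accepted by ctypes.CDLL; the call raises
    OSError if the library itself cannot be loaded.
    """
    lib = ctypes.CDLL(library)
    missing = []
    for name in names:
        try:
            getattr(lib, name)  # ctypes looks the symbol up in the loaded library
        except AttributeError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative NVML entry points (assumption: not necessarily the ones
    # that nvidia-container-cli fails on).
    probes = ["nvmlInit_v2", "nvmlDeviceGetCount_v2", "nvmlSystemGetDriverVersion"]
    try:
        print("missing:", missing_symbols("libnvidia-ml.so.1", probes))
    except OSError as exc:
        print("could not load NVML:", exc)
```

Running this before and after a driver change would show whether the newer driver exports more of the probed symbols.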

Thanks for your reply! I will try it later.

Hi, I used the CUDA 10.0 toolkit installer to install the newer driver. I am on Ubuntu 18.04 now, but I ran into a problem. The information is shown below:

uname -a
Linux xidian-S2600WFT 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

gcc --version
gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The error in /tmp/cuda_install_15685.log:
ERROR: Unable to load the 'nvidia-drm' kernel module.
ERROR: Installation has failed. Please see the file
'/var/log/nvidia-installer.log' for details. You may find
suggestions on fixing installation problems in the README available
on the Linux driver download page at www.nvidia.com.

The original kernel of my server was 5.4.0, but after looking at the table in the NVIDIA CUDA Installation Guide for Linux (the image of the table was not preserved), I changed the kernel version and the gcc version. I don’t know if that’s right; please help me.
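
To narrow down why the 'nvidia-drm' module failed to load, it may help to pull only the error lines out of the installer log the message points to. This is a minimal Python sketch, assuming the log marks errors with an "ERROR:" prefix as in the excerpt above; the helper name `error_lines` is hypothetical.

```python
from pathlib import Path


def error_lines(log_text):
    """Return the stripped lines of `log_text` that start with 'ERROR:'."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("ERROR:")
    ]


if __name__ == "__main__":
    # Path taken from the installer message quoted above.
    path = Path("/var/log/nvidia-installer.log")
    if path.exists():
        for line in error_lines(path.read_text(errors="replace")):
            print(line)
    else:
        print("log file not found:", path)
```

Wrapped continuation lines are not captured by this filter, so the full log is still worth reading around each hit.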

After changing the kernel version and gcc version to match what is expected, you would need to try the installation again. If it still fails, then I don’t have any further ideas.

Support for the GRID cards was phased out some time ago. That is why the last officially supported driver stops at R367. I was hoping that a newer driver might work (I don’t have a GRID K2 card to test on), but apparently not.

I don’t have any solution for you. It seems to me that the card is too old or obsolete for the purpose you are trying to use it for. In any event, you are welcome to file a bug; however, they are likely to direct you to the last supported driver (R367), so it may be necessary for you to go back and describe your issue based on that driver.

OK, thanks again for your reply! I will do my best and try the installer again.