Docker --gpus all but nvidia-smi indicated me NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Displ

1261200127 · March 6, 2022, 4:46am

Hi all, the OS is CentOS 7.9 and I am deploying a PyTorch Docker image on it.
Running nvidia-smi on the host gives the correct result.
But when I start a container with "docker run -it --gpus all
and run nvidia-smi inside the container, it reports this:
NVIDIA-SMI couldn’t find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

I am using a Tesla V100.

1261200127 · March 6, 2022, 4:52am

This is my CUDA validation output:
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: “Tesla V100-SXM2-16GB”
CUDA Driver Version / Runtime Version 11.2 / 11.0
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory: 16160 MBytes (16945512448 bytes)
(80) Multiprocessors, ( 64) CUDA Cores/MP: 5120 CUDA Cores
GPU Max Clock rate: 1530 MHz (1.53 GHz)
Memory Clock rate: 877 Mhz
Memory Bus Width: 4096-bit
L2 Cache Size: 6291456 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes

pathmunge /usr/sbin after

Alignment requirement for Surfaces: Yes

pathmunge /usr/sbin after

Device has ECC support: Enabled

pathmunge /usr/sbin after

Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 7
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 11.0, NumDevs = 1, Device0 = Tesla V100-SXM2-16GB
Result = PASS

One thing to note: I changed the torch version in the Docker image and reinstalled CUDA.
Otherwise, could someone please tell me where to find libnvidia-ml.so?
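
For anyone hitting the same error, here is a minimal Python sketch for checking where libnvidia-ml.so is, runnable on the host and inside the container. libnvidia-ml.so ships with the NVIDIA driver rather than the CUDA toolkit, so reinstalling CUDA in the image would not normally provide it. The directory list and the helper names `find_libnvidia_ml` and `loader_can_resolve` are assumptions made for illustration, not part of any NVIDIA tool, so adjust the paths for your own system.

```python
import ctypes
import glob
import os

# Assumed candidate directories; these vary by distribution and image.
DEFAULT_DIRS = [
    "/usr/lib64",
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib",
    "/usr/local/nvidia/lib64",
]


def find_libnvidia_ml(search_dirs=DEFAULT_DIRS):
    """Return sorted paths matching libnvidia-ml.so* in the given directories."""
    matches = []
    for directory in search_dirs:
        pattern = os.path.join(directory, "libnvidia-ml.so*")
        matches.extend(glob.glob(pattern))
    return sorted(set(matches))


def loader_can_resolve(name="libnvidia-ml.so.1"):
    """Return True if the dynamic loader can open the library by name."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    paths = find_libnvidia_ml()
    if paths:
        for path in paths:
            print("found:", path)
    else:
        print("libnvidia-ml.so not found in:", ", ".join(DEFAULT_DIRS))
    print("loader can resolve libnvidia-ml.so.1:", loader_can_resolve())
```

If the host lists the library but the container does not, the driver libraries are probably not being mounted into the container, which usually points at the NVIDIA container runtime setup rather than at the image contents.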

achartiernv · March 7, 2022, 4:08pm

Hello, this forum is dedicated to discussions related to using the sanitizer tools and API.
Questions related to CUDA can be raised at CUDA - NVIDIA Developer Forums
