Running Cuda on Docker

ekasatria · November 3, 2015, 7:10am

Hi everybody,

I'm having a problem: I want to access the GPU from inside a Docker container. I followed the tutorial "Docker on AWS GPU Ubuntu 14.04 / CUDA 6.5" on Seven Story Rabbit Hole, but had no luck.
Currently I have CUDA installed on the host (NVIDIA driver 352.55 with CUDA 5.5), and deviceQuery passes there.

Following the tutorial, the result of this command on the host

ls -la /dev | grep nvidia

is

crw-rw-rw-  1 root root 250,   0 Nov  3 06:55 nvidia-uvm
crw-rw-rw-  1 root root 195,   0 Nov  3 06:55 nvidia0
crw-rw-rw-  1 root root 195, 255 Nov  3 06:55 nvidiactl

The result from

cat /proc/driver/nvidia/version

shows the same result on the host:

NVRM version: NVIDIA UNIX x86_64 Kernel Module  352.55  Thu Oct  8 15:18:00 PDT 2015

But running deviceQuery fails with this result:

-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

When I try to install the driver inside the host, it shows this error:

An NVIDIA kernel module 'nvidia-uvm' appears to already be loaded in your kernel.  This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but 
         this may also happen if your kernel was configured without support for module unloading.  Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver.  If 
         no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module's  
         usage count, for which the simplest remedy is to reboot your computer.

I don't know what else to try. Can someone help me?

inJeans · November 4, 2015, 11:17pm

The driver in your Docker image must match the driver on your host. You say that you have NVIDIA driver 352.55 installed on the host, but the Docker image (on Docker Hub) used in that tutorial has NVIDIA driver 340.29.

So you need to either install the matching driver on your host or try a different Docker image that has the same driver.
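
A minimal sketch of that comparison, assuming the version string has the same format as the `/proc/driver/nvidia/version` output quoted earlier in this thread. The helper names `nvrm_version` and `drivers_match` are hypothetical, and the sample line is hard-coded so the script runs without a GPU.

```shell
#!/bin/sh
# Extract the driver version from text formatted like /proc/driver/nvidia/version.
nvrm_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeeds only when the two version strings are identical.
drivers_match() {
    [ "$1" = "$2" ]
}

# Sample host line copied from the first post (assumption: same format on your machine).
host_line='NVRM version: NVIDIA UNIX x86_64 Kernel Module  352.55  Thu Oct  8 15:18:00 PDT 2015'
host_version=$(printf '%s\n' "$host_line" | nvrm_version)
image_version='340.29'   # driver version reported for the tutorial image in this thread

if drivers_match "$host_version" "$image_version"; then
    echo "match: $host_version"
else
    echo "mismatch: host $host_version, image $image_version"
fi
```

On a real host, the same function could read the file directly with `nvrm_version < /proc/driver/nvidia/version`.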

flx42 · November 5, 2015, 3:00am

If you can update to CUDA 7.0, we now have a solution here:

With our approach there won’t be a mismatch problem between driver versions, simply because we don’t install the driver inside the image (only the toolkit) and the driver files are mounted when starting the container.
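
As a rough illustration of one piece of this, the device nodes listed in the first post can be passed to a container with Docker's `--device` flag. The sketch below only builds those flags from a list of device names. The `device_flags` helper is hypothetical, it is not nvidia-docker's own code, and it does not cover mounting the driver files.

```shell
#!/bin/sh
# Hypothetical helper: turn device names (one per line on stdin)
# into docker --device flags, one flag per output line.
device_flags() {
    while IFS= read -r name; do
        if [ -n "$name" ]; then
            printf '%s\n' "--device=/dev/$name"
        fi
    done
}

# Device names taken from the /dev listing in the first post.
printf '%s\n' nvidia-uvm nvidia0 nvidiactl | device_flags
```

The resulting flags could then be appended to a `docker run` command line.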

pkrull · November 12, 2015, 8:48pm

I’m using nvidia-docker and still seeing the same error, “CUDA driver version is insufficient for CUDA runtime version”.

I’m using a g2 instance on AWS w/ Ubuntu 14.04. I installed CUDA 7.5 on the host following the instructions on http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#pre-installation-actions using cuda-repo-ubuntu1404_7.5-18_amd64.deb. I’ve run deviceQuery on the host and see the nvidia devices under /dev.

On the host:

ubuntu@ip-172-31-18-221:~/src/nvidia-docker$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  352.39  Fri Aug 14 18:09:10 PDT 2015
GCC version:  gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)

I then ran,

sudo docker build -t cuda ubuntu-14.04/cuda/7.5

sudo docker build -t dq samples/deviceQuery/

ubuntu@ip-172-31-18-221:~$ sudo docker run --privileged dq
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Am I using nvidia-docker incorrectly? Should I be installing CUDA on the host differently?

Thanks for any help.

flx42 · November 18, 2015, 2:01am

You need to use the nvidia-docker wrapper instead of plain "docker".
If you have to use "sudo docker" (i.e. the user is not part of the "docker" group), then you should fetch the latest code and try something like this:
DOCKER='sudo docker' ./nvidia-docker …
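
For readers unfamiliar with this pattern, here is a minimal sketch of how a wrapper script can honor a `DOCKER` environment variable. It is an illustration only, not the actual nvidia-docker script. The `dry_run` function is hypothetical and just prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper function illustrating the DOCKER override pattern.
dry_run() {
    # Fall back to plain "docker" when DOCKER is unset or empty.
    # The expansion is deliberately unquoted so 'sudo docker' splits into two words.
    echo ${DOCKER:-docker} "$@"
}

( unset DOCKER; dry_run run dq )            # prints: docker run dq
( DOCKER='sudo docker'; dry_run run dq )    # prints: sudo docker run dq
```

Each call runs in a subshell so the variable setting does not leak into the rest of the script.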

oferbh · May 22, 2016, 4:48pm

Hello,
I'm using an AWS g2 machine with CUDA 7.0 and trying to run a Docker container based on nvidia/cuda:7.0-cudnn4-runtime-ubuntu14.04.
deviceQuery runs successfully on the host with the following message:
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GRID K520
Result = PASS

I've tried the nvidia-docker script, and it successfully creates the machine with the message:
[ NVIDIA ] =INFO= Driver version: 352.93
[ NVIDIA ] =INFO= CUDA image version: 7.0

But when running deviceQuery, I receive the following error:

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected

Thanks for your help.

flx42 · May 23, 2016, 4:15am

You are running an old version of nvidia-docker. Please follow the instructions on the GitHub page to install the latest version:

oferbh · May 23, 2016, 2:34pm

Thanks, it solved the problem.
