Some context: I have been developing simulations of spiking neural networks using CUDA/C++, and I am now trying to port them to my university's supercomputer cluster via Docker. I compile and run these simulations on my host machine running Windows 10 Pro (21H2) with an RTX 3060 Ti, CUDA version 11.7, and driver version 516.40.
I have followed this tutorial to completion (despite it requiring Windows 11, it seems to work with the latest version of Windows 10, 21H2). I can run the example Docker images (the n-body problem), but I run into trouble when I try to compile code using nvcc inside a Docker container (11.7.0-devel-ubuntu20.04). I can compile the .cu file with nvcc, but when I try to run the executable I get the error:
CUDA driver version is insufficient for CUDA runtime version
nvidia-smi is unavailable in the container, but running nvcc --version gives me the following information:
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
After much googling I am still at a loss, so any help would be appreciated.
I also still don’t understand which of the numbers above refer to the driver version and which refer to the runtime version.
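In case it is useful, even a trivial test program (a hypothetical minimal sketch below, not my actual simulation code) is enough to hit the same error, since the driver/runtime compatibility check happens on the first CUDA runtime call:

// minimal_check.cu -- hypothetical minimal reproducer, not the real simulation code
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummy() {}

int main() {
    // Even this no-op call initialises the CUDA runtime, which is when
    // the driver/runtime compatibility check is performed.
    cudaError_t err = cudaFree(0);
    if (err == cudaSuccess) {
        dummy<<<1, 1>>>();
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        // When no sufficiently new CUDA driver is visible, this prints the same
        // "CUDA driver version is insufficient for CUDA runtime version" message.
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Kernel launched and completed successfully\n");
    return 0;
}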
Hello,
Here is some explanation of all those version numbers :)
- CUDA 11.7 is the version of the CUDA API. To give you an analogy, it is like OpenGL 4.5 or DirectX 12. It indicates the CUDA feature set that you have access to.
- The driver version, in your case 516.40, is the Display Driver version. The CUDA Driver is part of the Display Driver, so you already have a CUDA Driver installed on your system.
- Each CUDA Driver is backward compatible and supports features up to a certain level: for instance, a CUDA Driver that advertises 11.7 will support everything from CUDA 1.0 to CUDA 11.7.
Now for the Runtime:
- The CUDA Runtime is not part of the display driver; it comes with the toolchain (CUDA Toolkit) that you download, and it provides a set of features built on top of the CUDA Driver.
- Unlike the driver, the Runtime is a redistributable component that gets shipped with your application.
- Each CUDA Runtime requires a driver with a minimum feature set to run, and this is checked at startup (see the small sketch below).
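If you want to see both numbers from inside a program, a minimal sketch along these lines (the file name is just an example) asks the runtime for each of them:

// version_check.cu -- example name; queries the two versions separately
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    // Highest CUDA version supported by the installed CUDA Driver
    // (the one that ships with the display driver).
    cudaDriverGetVersion(&driverVersion);
    // Version of the CUDA Runtime library this binary is using.
    cudaRuntimeGetVersion(&runtimeVersion);
    // Values are encoded as 1000*major + 10*minor, e.g. 11070 == 11.7.
    printf("CUDA Driver version:  %d.%d\n", driverVersion / 1000, (driverVersion % 100) / 10);
    printf("CUDA Runtime version: %d.%d\n", runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    return 0;
}

Build and run it with nvcc version_check.cu -o version_check && ./version_check. On a correctly set up system both numbers should read 11.7 in your case; inside a container that cannot see the CUDA Driver, the driver number will typically come back as 0.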
Finally, what might be going on in your particular case:
- Considering you are able to run CUDA apps, your setup and driver install are likely fine.
- Most likely, some driver or library in the container you are using to build is wrong or is taking over the one we provide.
- To help more, we would need the following:
** On bare metal, the output of the nvidia-smi command within WSL (use nvidia-smi, not nvidia-smi.exe)
** The exact docker command line and the container used to do those builds
Thanks!
Hi,
Thank you for taking the time to provide this detailed reply.
Running nvidia-smi within WSL gives me:
Mon Aug 1 18:54:47 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07 Driver Version: 516.40 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| 0% 52C P8 26W / 200W | 1942MiB / 8192MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
To set up the container and run the code:
docker pull nvidia/cuda:11.7.0-devel-ubuntu20.04
docker run --name cuda_test -it nvidia/cuda:11.7.0-devel-ubuntu20.04
docker cp .\kernel.cu cuda_test:/usr/kernel.cu
nvcc kernel.cu -o ker
./ker
Using Docker version 20.10.17, build 100c701
Container version: nvidia/cuda:11.7.0-devel-ubuntu20.04
Also, it might not matter for this container, but if you are trying to run kernels in that container, make sure to have nvidia-container-toolkit installed and try running with --gpus all:
https://docs.nvidia.com/ai-enterprise/deployment-guide/dg-docker.html#enabling-the-docker-repository-and-installing-the-nvidia-container-toolkit
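For example, after installing the toolkit in your WSL distro, the run step from your post would become something like:

docker run --gpus all --name cuda_test -it nvidia/cuda:11.7.0-devel-ubuntu20.04

Without --gpus all the container never gets access to the WSL CUDA Driver, which is exactly the situation where the runtime reports an insufficient driver version.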
Thank you. I had come across this guide before, but I was confused because it says “for your Linux distribution.”
Do I run those commands within the docker container or within WSL?
Brilliant, thank you, that was the issue: I didn’t have the container toolkit installed. For anyone else who runs into this issue:
First (not 100% sure this is necessary, but I did it): go into the Docker settings and enable integration with your ubuntu-20.04 distro.
Open up Ubuntu in the WSL terminal and follow the instructions linked above for enabling the Docker repository and installing the toolkit.