Compiling with NVCC in a Docker container under CUDA for WSL

Some context: I have been developing simulations of spiking neural networks using CUDA/C++, and now I am trying to port them to my university's supercomputer cluster via Docker. I compile and run these simulations on my host machine running Windows 10 Pro (21H2) with an RTX 3060 Ti, CUDA version 11.7 and driver version 516.40.
I have followed this tutorial to completion (despite it requiring Windows 11, it seems to work with the latest version of Windows 10, 21H2). I can run the example Docker images (the n-body problem); however, I run into trouble when I try to compile code using nvcc inside a Docker container (11.7.0-devel-ubuntu20.04). I can compile the .cu file with nvcc, but when I try to run the executable I get the error:

CUDA driver version is insufficient for CUDA runtime version

nvidia-smi is unavailable in the container, but running nvcc --version gives me the following information:

Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

After much googling I am still at a loss, so any help would be appreciated.
I also still don’t understand which numbers in the above refer to the driver version and which refer to the runtime version.

Hello,
Here is some explanation of all those version numbers :)

  • CUDA 11.7 is the version of the CUDA API. To give you an analogy, it is like OpenGL 4.5 or DirectX 12: it indicates the CUDA feature set that you have access to.
  • The driver version, in your case 516.40, is the display driver version. The CUDA Driver is part of the display driver, so you do have a CUDA Driver installed on your system.
  • Each CUDA Driver is backward compatible and supports up to a certain feature set: for instance, if a CUDA Driver advertises 11.7, it supports everything from CUDA 1.0 to CUDA 11.7 features.

Now for the Runtime:

  • The CUDA Runtime is not part of the display driver; it comes with the toolchain (CUDA Toolkit) that you download and provides a set of features built on top of the CUDA Driver.
  • Unlike the driver, the Runtime is a redistributable component that ships with your application.
  • Each CUDA Runtime requires a driver with a minimum feature set, and this is checked at startup.
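To make the distinction concrete, here is a small shell sketch that pulls the two driver-side numbers out of an nvidia-smi header line. The sample line and the sed patterns assume the usual nvidia-smi header layout (the values are the ones from this thread); the runtime version, by contrast, is the one nvcc --version reports from the toolkit.

```shell
# The nvidia-smi header reports driver-side versions only.
# Sample header line in the usual nvidia-smi layout (assumed format):
line="| NVIDIA-SMI 515.48.07    Driver Version: 516.40       CUDA Version: 11.7     |"

# Display driver version (516.40 here):
driver=$(printf '%s\n' "$line" | sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p')

# Highest CUDA feature set this driver supports (11.7 here) -- NOT the runtime version:
cuda_max=$(printf '%s\n' "$line" | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')

echo "Display driver $driver; its CUDA driver supports up to CUDA $cuda_max"
```

The "driver version is insufficient" error means that startup check found the driver's supported CUDA version below what the runtime (from nvcc, release 11.7 here) requires.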

Finally, here is what might be going on in your particular case:

  • Considering you are able to run CUDA apps, your setup and driver install are likely correct
  • Most likely some driver or library in the container you are using to build is wrong or taking over the one we provide
  • To help more we would need the following:
    ** On bare metal, the output of the nvidia-smi command within WSL (use nvidia-smi, not nvidia-smi.exe)
    ** The exact docker command line and the container used to do those builds

Thanks!


Hi,

Thank you for taking the time to provide this detailed reply.
Running nvidia-smi within wsl gives me:

Mon Aug  1 18:54:47 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 516.40       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   52C    P8    26W / 200W |   1942MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

To set up the container and run the code:

docker pull nvidia/cuda:11.7.0-devel-ubuntu20.04
docker run --name cuda_test -it nvidia/cuda:11.7.0-devel-ubuntu20.04
# from the Windows host, copy the source into the container:
docker cp .\kernel.cu cuda_test:/usr/kernel.cu
# inside the container:
nvcc kernel.cu -o ker
./ker

Using Docker version 20.10.17, build 100c701
Container version: nvidia/cuda:11.7.0-devel-ubuntu20.04

Also, it might not matter for this container, but if you are trying to run kernels in that container, make sure to have nvidia-container-toolkit installed and try running with --gpus all:
https://docs.nvidia.com/ai-enterprise/deployment-guide/dg-docker.html#enabling-the-docker-repository-and-installing-the-nvidia-container-toolkit
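As a quick sketch of that suggestion (assuming the nvidia-container-toolkit from the linked guide is already installed in the WSL distro), mapping the GPU in with --gpus all should make nvidia-smi work inside the same image:

```shell
# Sketch: run the same image with the GPU mapped into the container.
# Assumes nvidia-container-toolkit is installed in the WSL distro (see the guide above).
docker run --rm --gpus all nvidia/cuda:11.7.0-devel-ubuntu20.04 nvidia-smi
```

If this prints the same table as nvidia-smi in WSL, the container can see the GPU.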


Thank you, I had come across this guide before, but I was confused as it says “for your Linux distribution”.
Do I run those commands within the docker container or within WSL?

Within WSL:

  • The commands in the guide will set up the NVIDIA extension for Docker to map your GPU into your container
  • Then run docker with “--gpus all”

Brilliant, thank you, that was the issue. I didn’t have the container toolkit installed. For anyone else who runs into this issue:
First (not 100% sure if this is necessary, but I did it): go into the Docker settings and enable integration with your ubuntu-20.04 distro.
Open up Ubuntu in the WSL terminal and follow the instructions linked above for enabling the Docker repository and installing the toolkit.
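Putting the thread together, here is a recap of the working sequence; the only change from the commands posted earlier is --gpus all, and the file names are the ones used above:

```shell
# Recap: the original docker run was missing --gpus all.
docker run --name cuda_test --gpus all -it nvidia/cuda:11.7.0-devel-ubuntu20.04
# From the Windows host:
docker cp .\kernel.cu cuda_test:/usr/kernel.cu
# Inside the container:
nvcc kernel.cu -o ker
./ker
```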