PyCUDA using CUDA + WSL + Docker

I have been trying to run a program that uses PyCUDA and Pygame in the following environment:

OS: Windows 11 Pro
Build: 22000.co_release.210604-1628
GPU: NVIDIA Quadro RTX 4000
NVIDIA driver: 470.76
CUDA Version: 11.4
Linux Distribution: Ubuntu 20.04
WSL System Info: Linux DESKTOP-HCOKLQU 5.10.43.3-microsoft-standard-WSL2 #1 SMP Wed Jun 16 23:47:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
CUDA Toolkit: cuda-toolkit-11-4
Docker version: 20.10.7

I followed the “CUDA on WSL User Guide” carefully, and the N-body simulation test passes. However, when I try to run my program inside a Docker container, I get the following error:

pygame 2.0.1 (SDL 2.0.14, Python 3.8.3)
Hello from the pygame community. Contribute - pygame wiki


Geant4 version Name: geant4-10-05-patch-01 (17-April-2019)
Copyright : Geant4 Collaboration
References : NIM A 506 (2003), 250-303
: IEEE-TNS 53 (2006), 270-278
: NIM A 835 (2016), 186-225
WWW : http://geant4.org/


Visualization Manager instantiating with verbosity "warnings (3)"...

Number of photons: 1000
Detector used: sipm
View setup before simulation: enabled
View plots: enabled
Analyze simulation: enabled
Seed=0

Start Simulation Done

Creating Blank LXe Setup: importing LXE cell Done
146.30466947534526

Mesh Imports Successful

Fatal Python error: (pygame parachute) Segmentation Fault
Python runtime state: initialized

Current thread 0x00007f55a3e92740 (most recent call first):
File "/opt/anaconda3/lib/python3.8/site-packages/pycuda-2021.1-py3.8-linux-x86_64.egg/pycuda/tools.py", line 184 in make_default_context
File "/opt/chroma/chroma/gpu/tools.py", line 126 in create_cuda_context
File "/opt/chroma/chroma/loader.py", line 150 in load_bvh
File "/opt/chroma/chroma/loader.py", line 185 in create_geometry_from_obj
File "/opt/chroma/chroma/camera.py", line 1013 in view
File "NewXenon_cell_simulation_13_7_18.py", line 46 in <module>
Aborted

Looking into the PyCUDA source code, the failure occurs in the following call:
pycuda.driver.Device.count()
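
To check whether the crash is specific to PyCUDA or comes from the driver library itself, the device count can be queried without PyCUDA or Pygame loaded. The following is only a minimal sketch: it assumes the driver library is exposed to the container as libcuda.so.1, and the helper name cuda_device_count is made up for illustration. It calls the CUDA driver API functions cuInit and cuDeviceGetCount through ctypes, which is roughly what pycuda.driver.init() followed by pycuda.driver.Device.count() ends up doing.

```python
import ctypes


def cuda_device_count(library_name="libcuda.so.1"):
    """Return the number of CUDA devices, or None if the driver is unusable.

    library_name is an assumption: adjust it if the driver library is
    exposed under a different name or path in your container.
    """
    try:
        libcuda = ctypes.CDLL(library_name)
    except OSError:
        # The driver library is not visible in this environment.
        return None

    # cuInit(0) must succeed before any other driver API call.
    # A return value of 0 is CUDA_SUCCESS.
    if libcuda.cuInit(0) != 0:
        return None

    count = ctypes.c_int(0)
    if libcuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value
```

Calling cuda_device_count() inside the container and printing the result should give either a device count or None. If this small probe also segfaults, that would suggest the problem lies in how the driver is mapped into the container rather than in PyCUDA, Pygame, or Chroma.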

The Docker image was built successfully in the WSL environment. I have tested the same setup on a native Linux OS, where it ran successfully. The Docker image is based on “nvidia/cudagl:11.3.0-devel-ubuntu20.04”, which seems to use its own CUDA 11.3 libraries, but “nvidia-smi” does not work inside the container, so the image apparently does not have its own NVIDIA driver. Do I need to make changes to the image?

Any help would be great!

NVIDIA driver: 470.76

First of all, try installing the latest official NVIDIA Windows drivers (471.11) or the 471.21 preview drivers from CUDA on WSL | NVIDIA Developer

For nvidia-smi inside Docker, see `nvidia-smi` command not found in Docker Container - #5 by onomatopellan

I have updated the NVIDIA driver to 471.21. There is no change in the error.

Edit: typo

Hello,

First, make sure you are passing --gpus 'all,"capabilities=compute,utility"' on your docker command line, especially if you built the container image yourself. (You can see how to embed that in your container image here: User Guide — NVIDIA Cloud Native Technologies documentation.)
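
The nested quoting in that --gpus value is easy to get wrong when the container is launched from a script. As a rough sketch, the command can be built as an argument list in Python so that no shell quoting is needed. The image tag my-chroma-image and the helper name docker_run_command are hypothetical; the script name is the one from the traceback above.

```python
import shlex


def docker_run_command(image, program, gpus='all,"capabilities=compute,utility"'):
    """Build an argument list for `docker run` with the GPU request above."""
    return ["docker", "run", "--rm", "--gpus", gpus, image, *program]


cmd = docker_run_command(
    "my-chroma-image",  # hypothetical image tag, replace with your own
    ["python", "NewXenon_cell_simulation_13_7_18.py"],
)

# Show the equivalent shell command line, with quoting applied for display.
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd) as-is. When the same request is embedded in the image instead, the NVIDIA Container Toolkit documentation describes doing it through the NVIDIA_VISIBLE_DEVICES and NVIDIA_DRIVER_CAPABILITIES environment variables.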

Second, we don’t currently map nvidia-smi into the container by default on WSL, but we have a merge request that should fix that very soon :): WSL - Add code to map the binaries from the driver store if they are... (!72) · Merge requests · nvidia / container-toolkit / libnvidia-container · GitLab

Let us know if that helps,

Thanks!