Stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown

Thank you kmorozov. After installing all the optional updates, I got it working. I should have paid closer attention to the instructions about the kernel version in the first place.

One more question: I read somewhere that there is a performance penalty when using the GPU on WSL2. Is that true? If so, I can set up a Linux dual boot instead and use WSL only for prototyping.

Some performance issues are known in WSL2, caused by the GPU paravirtualization used to expose the GPU hardware inside the WSL2 container (please see the NVIDIA CUDA WSL blog for more details). However, the actual numbers depend heavily on the workload. Generally, for GPU-bound applications the performance difference from native runs is expected to be significantly smaller. There are some corner cases, of course, that need to be looked at on a case-by-case basis.

I’m having similar issues. Started over with a fresh Ubuntu 20.04 WSL 2 installation:
uname -r
4.19.128-microsoft-standard

Installed the correct NVIDIA driver, 460.20:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.20       Driver Version: 460.20       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208…   WDDM  | 00000000:53:00.0  On |                  N/A |
| 24%   55C    P3    65W / 260W |   2506MiB / 11264MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I was able to build the BlackScholes example without errors. When I try to run it:

[./BlackScholes] - Starting…
CUDA error at …/…/common/inc/helper_cuda.h:777 code=35(cudaErrorInsufficientDriver) “cudaGetDeviceCount(&device_count)”

I haven’t tried with the container(s) yet. Before the Ubuntu WSL2 re-install I was getting a similar error there, so I assume that will still be an issue until I can fix this.

After installing Docker and the NVIDIA container packages per the instructions, trying to run:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
results in:
stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

@seamans What does it show if you type ls -la /dev/dxg from bash?
What’s your Windows build? (run winver.exe)

The Windows build is: Version 2004 (OS Build 19041.508)
/dev/dxg is missing, even after re-installing the NVIDIA recommended driver.

That’s the problem. You need a Windows build from the Insiders Dev/Fast channel (like build 20221) in order to use CUDA in WSL2. As stated here:

Apparently, the issue is that you need WDDM 2.9 which is only currently available on the insider builds. So even though I have kernel 4.19.128-microsoft-standard, which exceeds the kernel requirements and the NVIDIA 460.20 driver I still won’t be able to get the CUDA on WSL 2 working until I can upgrade the WDDM 2.7 driver to WDDM 2.9. (I used dxdiag.exe to check the WDDM version).
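The two Linux-side checks mentioned in this exchange (the /dev/dxg device node and the kernel release from uname -r) can be scripted. What follows is only a rough sketch: the function names are made up for illustration, and the 4.19.121 default minimum is an assumption recalled from the CUDA on WSL guide, not something stated in this thread. It cannot check the WDDM version, which has to be read on the Windows side (for example with dxdiag.exe).

```shell
# Succeed when dotted version $1 is >= dotted version $2 (numeric, per field).
version_at_least() {
    printf '%s %s\n' "$1" "$2" | awk '{
        n = split($1, have, "."); m = split($2, want, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            if (have[i] + 0 > want[i] + 0) exit 0
            if (have[i] + 0 < want[i] + 0) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: check the device node and the kernel release.
#   $1 device node (default /dev/dxg)
#   $2 kernel release string (default: output of uname -r)
#   $3 minimum kernel version (ASSUMED default 4.19.121; verify against the guide)
check_wsl_gpu() {
    dxg_node=${1:-/dev/dxg}
    kernel_release=${2:-$(uname -r)}
    kernel_min=${3:-4.19.121}
    rc=0
    if [ -e "$dxg_node" ]; then
        echo "ok: $dxg_node exists"
    else
        echo "missing: $dxg_node (the Windows build or driver may not expose the GPU to WSL2)"
        rc=1
    fi
    # Strip the "-microsoft-standard..." suffix before comparing.
    if version_at_least "${kernel_release%%-*}" "$kernel_min"; then
        echo "ok: kernel $kernel_release"
    else
        echo "too old: kernel $kernel_release (wanted >= $kernel_min)"
        rc=1
    fi
    return "$rc"
}
```

Calling check_wsl_gpu with no arguments inspects the running system. As the post above shows, a recent enough kernel alone does not help when /dev/dxg is absent, so the device node is checked first.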

Oh, well, I really only use Windows for gaming so I’ll continue to do my GPU development on my Ubuntu system (Dual boot) until that update is generally available since I have no interest in joining the M$ insider program.

Thanks for the reply.


I am also able to run the CUDA samples under WSL2 successfully, but running a docker container gives the same error. I followed all the steps given at [CUDA on WSL :: CUDA Toolkit Documentation (nvidia.com)].

From the deb entries I can see that nvidia-docker-runtime comes from the stable repo and only libnvidia-container comes from the experimental repo. Is that expected, or should both come from the experimental repo?

I do have all the files and folders mentioned in this post, and the CUDA examples run fine under my WSL 2. The Windows Insider build and the Linux kernel are also the latest (Windows 10 - 21376.1, Linux version 5.4.72-microsoft-standard-WSL2). The NVIDIA driver version is 470.14, and the CUDA version I installed on WSL is 11.2.

$ sudo nvidia-container-cli -k -d /dev/tty info

-- WARNING, the following logs are for debugging purposes only --

I0511 13:06:46.372112 19307 nvc.c:372] initializing library context (version=1.4.0, build=704a698b7a0ceec07a48e56c37365c741718c2df)
I0511 13:06:46.372174 19307 nvc.c:346] using root /
I0511 13:06:46.372266 19307 nvc.c:347] using ldcache /etc/ld.so.cache
I0511 13:06:46.372284 19307 nvc.c:348] using unprivileged user 65534:65534
I0511 13:06:46.372310 19307 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0511 13:06:46.407769 19307 dxcore.c:226] Creating a new WDDM Adapter for hAdapter:40000000 luid:f7a568
I0511 13:06:46.424907 19307 dxcore.c:267] Adding new adapter via dxcore hAdapter:40000000 luid:f7a568 wddm version:3000
I0511 13:06:46.424976 19307 dxcore.c:325] dxcore layer initialized successfully
W0511 13:06:46.425554 19307 nvc.c:397] skipping kernel modules load on WSL
I0511 13:06:46.425753 19308 driver.c:101] starting driver service
E0511 13:06:46.438484 19308 driver.c:168] could not start driver service: load library failed: /usr/lib/wsl/drivers/nvmdi.inf_amd64_c883b852a1685351/libnvidia-ml.so.1: undefined symbol: devicesetgpcclkvfoffset
I0511 13:06:46.438646 19307 driver.c:203] driver service terminated successfully
nvidia-container-cli: initialization error: driver error: failed to process request

What else could be a problem?
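In a debug log like the one above, the informational lines (prefix I) outnumber the ones that matter. As a minimal sketch, assuming the log has been saved to a file and is fed in on standard input, the two hypothetical helpers below keep only the error-level lines (prefix E followed by a four-digit date) and pull out the name of any undefined symbol. The function names are illustrative and not part of nvidia-container-cli.

```shell
# Print only error-level lines from an nvidia-container-cli debug log on stdin.
# Error lines look like: "E0511 13:06:46.438484 19308 driver.c:168] ..."
nvc_errors() {
    grep -E '^E[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+ '
}

# Print the symbol name from any "undefined symbol: NAME" load failure on stdin.
nvc_missing_symbols() {
    sed -n 's/.*undefined symbol: \([A-Za-z0-9_]*\).*/\1/p'
}
```

For example, with the log saved as nvc.log (a hypothetical file name), `nvc_errors < nvc.log` shows only the failing step, and `nvc_missing_symbols < nvc.log` prints the symbol that libnvidia-ml.so.1 could not resolve.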

Hello kmorozov! I have a similar problem:
"
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.
ERRO[0546] error waiting for container: context canceled
"
5.4.72-microsoft-standard-WSL2
Ubuntu 18.04.5 LTS
cuda 11
How can I fix it?

Hi. I get an error too, but it is not the same as any of the ones above.

uname -a
Linux esi09 4.19.128-microsoft-standard #1 SMP Tue Jun 23 12:58:10 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

the log info is:
-- WARNING, the following logs are for debugging purposes only --

I0511 07:18:58.774674 3144 nvc.c:372] initializing library context (version=1.4.0, build=704a698b7a0ceec07a48e56c37365c741718c2df)
I0511 07:18:58.774726 3144 nvc.c:346] using root /
I0511 07:18:58.774731 3144 nvc.c:347] using ldcache /etc/ld.so.cache
I0511 07:18:58.774735 3144 nvc.c:348] using unprivileged user 65534:65534
I0511 07:18:58.774746 3144 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0511 07:18:58.794037 3144 dxcore.c:226] Creating a new WDDM Adapter for hAdapter:40000000 luid:22b686
E0511 07:18:58.794111 3144 dxcore.c:234] Found a WDDM adapter running a driver with pre-WDDM 2.8 . Skipping it.
I0511 07:18:58.794118 3144 dxcore.c:226] Creating a new WDDM Adapter for hAdapter:40000040 luid:22b6eb
I0511 07:18:58.803501 3144 dxcore.c:267] Adding new adapter via dxcore hAdapter:40000040 luid:22b6eb wddm version:3000
I0511 07:18:58.803513 3144 dxcore.c:325] dxcore layer initialized successfully
W0511 07:18:58.803740 3144 nvc.c:397] skipping kernel modules load on WSL
I0511 07:18:58.803860 3151 driver.c:101] starting driver service
E0511 07:18:58.817411 3151 driver.c:168] could not start driver service: load library failed: /usr/lib/wsl/drivers/nv_dispi.inf_amd64_c7cfdb57b3071e7f/libnvidia-ml.so.1: undefined symbol: devicesetgpcclkvfoffset
I0511 07:18:58.817505 3144 driver.c:203] driver service terminated successfully

Can you help me?
I am on Windows (OS build 21370.1) with WSL2, and the card is a 2060. Without docker, BlackScholes runs successfully.
I have the file /dev/dxg.
ls /usr/lib/wsl/lib/
libcuda.so libd3d12.so 'libdxcore (1).so' libnvidia-ml.so.1
libcuda.so.1 'libd3d12core (1).so' libdxcore.so libnvidia-opticalflow.so.1
libcuda.so.1.1 libd3d12core.so libnvcuvid.so.1 libnvwgf2umx.so
'libd3d12 (1).so' libdirectml.so libnvidia-encode.so.1 nvidia-smi

I have the same issue!
The only difference in my case is the Linux version, 5.10.16.3-microsoft-standard-WSL2. (It was 5.4.* until I ran ‘wsl --update’, which gave me v5.10.16.3.)

$ sudo nvidia-container-cli -k -d /dev/tty info
gives the same output.

I reinstalled Ubuntu 18.04 on WSL2, along with CUDA, three times, but the same error occurred each time.

How can I solve this problem?

Also, using the Docker Desktop backend instead of the nvidia-docker2 installation described in the documentation [CUDA on WSL :: CUDA Toolkit Documentation (nvidia.com)] gives the same problem.
So I think the problem is in the NVIDIA driver.

I want to try the older NVIDIA driver 465.21, because someone reported that everything works fine with it in a similar situation (470.14 - WSL with W10 Build 21343 - NVIDIA-SMI error - #20 by AKAMolasses).

Is there any way to get an older version of the NVIDIA driver for WSL?
I couldn’t find one on the NVIDIA website. (I did find older regular drivers, but as I understand it they do not support WSL.)

@asobod11138 Dunno if I can post mega links here. For driver 465.42 look for “WDDM 3.0 / 465.42” on google.

Yes, the issue is probably this missing symbol in the latest driver: devicesetgpcclkvfoffset.

Hi there - please see my post : Guide to run CUDA + WSL + Docker with latest versions (21382 Windows build + 470.14 Nvidia)

Most of the guides are now obsolete since Docker is supporting GPU in Windows via their own WSL2 integration. The guide goes into the steps to get it working correctly.

All the best!

Thanks, it works !

I tried earlier with the latest version, which did not work.

It seems there are still some issues to be fixed. Sadly, those are blockers! I lost a lot of time on this.
https://github.com/NVIDIA/nvidia-docker/issues/1496

Latest driver 470.76 fixes the problem.

Just install the driver and restart the docker service with sudo service docker restart
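To confirm that the installed driver is at least the version reported above as fixed, the version can be read from the nvidia-smi header and compared numerically. This is a hedged sketch rather than an official procedure: the helper names are invented for illustration, and it assumes the header contains a "Driver Version: X.Y" field as in the nvidia-smi output quoted earlier in this thread.

```shell
# Extract the "Driver Version" value from nvidia-smi header text on stdin.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when dotted version $1 is >= dotted version $2 (numeric, per field).
version_at_least() {
    printf '%s %s\n' "$1" "$2" | awk '{
        n = split($1, have, "."); m = split($2, want, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            if (have[i] + 0 > want[i] + 0) exit 0
            if (have[i] + 0 < want[i] + 0) exit 1
        }
        exit 0
    }'
}
```

A possible use is `version_at_least "$(nvidia-smi | driver_version)" 470.76 && sudo service docker restart`, which restarts docker only when the driver is new enough.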

CUDA on WSL | NVIDIA Developer


I had the same error. I had followed every step except joining the Windows Insider programme; I thought it wasn’t necessary since I had already installed WSL2, but it is. Since joining the Insider programme and upgrading to Windows 11, the same code now runs on the GPU.

I was having these issues until I upgraded from Windows 10 to Windows 11 and downloaded the latest GeForce drivers for my laptop GPU (Official GeForce Drivers | NVIDIA). Restart the computer and confirm that the WDDM version is 2.9 or higher (it was 3.0 in my case).

I am using Windows 10 version 21H1 (build 19043). I read the tutorial and I think I installed everything.

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

If I try to stop the docker service:
sudo service docker stop
docker: unrecognized service

I don’t have the /dev/dxg device.

Here are the results of these commands:
ls /usr/lib/wsl/lib
libcuda.so libd3d12.so libnvcuvid.so libnvidia-encode.so libnvidia-opticalflow.so nvidia-smi
libcuda.so.1 libd3d12core.so libnvcuvid.so.1 libnvidia-encode.so.1 libnvidia-opticalflow.so.1
libcuda.so.1.1 libdxcore.so libnvdxdlkernels.so libnvidia-ml.so.1 libnvwgf2umx.so

uname -a
Linux Vladi-PC 5.10.60.1-microsoft-standard-WSL2 #1 SMP Wed Aug 25 23:20:18 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I also got the same error while trying to set up TAO.
The command that gave me this error is:
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

Error message:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: change root failed: no such file or directory: unknown.

The documentation I was trying to follow is: TAO Toolkit Quick Start Guide - NVIDIA Docs