Using nvidia-docker containers as a non-root user

I want to use a CUDA container in Docker as a non-root user, but I am running into permission problems. Here’s an example Dockerfile:

FROM nvidia/cudagl:11.2.2-runtime-ubuntu18.04

RUN useradd -ms /bin/bash testuser -G video,sudo

USER testuser
ENTRYPOINT ["/bin/bash"]

Running nvidia-smi gives the following error: Failed to initialize NVML: Insufficient Permissions

My application uses VirtualGL and Xvfb to render Chrome with a GPU, if that’s relevant. Everything works perfectly fine as the root user.

TL;DR - Check the gid of the vglusers group on the host. Create a group with the same gid in the container, and add the user to that group.

Investigating this a bit, I looked at the nvidia devices in the container:

root@56cef279b83f:/# cd /dev
root@56cef279b83f:/dev# ls -l | grep nvidia
crw-rw---- 1 root 1005 195,   0 Nov 23 23:13 nvidia0
crw-rw---- 1 root 1005 195, 255 Nov 23 23:13 nvidiactl
crw-rw---- 1 root 1005 195, 254 Nov 23 23:13 nvidia-modeset
crw-rw-rw- 1 root root 506,   0 Nov 23 23:13 nvidia-uvm
crw-rw-rw- 1 root root 506,   1 Nov 23 23:13 nvidia-uvm-tools

The nvidia devices belonged to a group with gid 1005. This was odd as there was no group in the container with that ID.
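To confirm this kind of mismatch, it can help to compare the numeric gid on a device node with the container's group file. The following is a minimal sketch, assuming GNU `stat` is available and that groups are defined in `/etc/group` (the usual case inside a container). The function name `describe_device_group` is made up for illustration.

```shell
# Hypothetical helper: report the numeric gid that owns a file and
# whether that gid maps to a named group in the given group file.
describe_device_group() {
    dev="$1"
    group_file="${2:-/etc/group}"               # assumption: groups live in /etc/group
    gid=$(stat -c '%g' "$dev") || return 1      # numeric group id (GNU stat)
    # Look for a group entry whose third field is this gid.
    name=$(awk -F: -v gid="$gid" '$3 == gid { print $1; exit }' "$group_file")
    if [ -n "$name" ]; then
        echo "$dev: gid $gid ($name)"
    else
        echo "$dev: gid $gid (no matching group)"
    fi
}

# Example (inside the container): describe_device_group /dev/nvidiactl
```

In the situation above, this would be expected to report gid 1005 with no matching group.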

I then looked at the devices on the host. As per my VGL setup, they are owned by root, and the restricted ones (mode crw-rw----) have vglusers as their group.

(venv) jsim@goliath:/var/log$ cd /dev/
(venv) jsim@goliath:/dev$ ls -l | grep nvidia
crw-rw----   1 root vglusers 195,   0 Nov 24 10:13 nvidia0
drwxr-xr-x   2 root root           80 Nov 24 10:31 nvidia-caps
crw-rw----   1 root vglusers 195, 255 Nov 24 10:13 nvidiactl
crw-rw----   1 root vglusers 195, 254 Nov 24 10:13 nvidia-modeset
crw-rw-rw-   1 root root     506,   0 Nov 24 10:13 nvidia-uvm
crw-rw-rw-   1 root root     506,   1 Nov 24 10:13 nvidia-uvm-tools

As it turns out, vglusers has a gid of 1005!

jsim@goliath:/dev$ cat /etc/group | grep vglusers
vglusers:x:1005:jsim
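Since `grep vglusers` matches any line containing that string, an exact lookup on the group name may be more robust when scripting this. A small sketch, assuming the group is defined in `/etc/group` (`getent group vglusers` is an alternative on hosts that use other group databases). The helper name `group_gid` is hypothetical.

```shell
# Hypothetical helper: print the numeric gid of a named group,
# matching the name exactly rather than as a substring.
group_gid() {
    group_name="$1"
    group_file="${2:-/etc/group}"   # assumption: group is listed in /etc/group
    awk -F: -v name="$group_name" '
        $1 == name { print $3; found = 1; exit }
        END { exit !found }         # non-zero exit status if the group is missing
    ' "$group_file"
}

# Example (on the host): group_gid vglusers
```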

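Device permissions are checked against the numeric gid, not the group name, so the container user needs to be in a group with gid 1005. In my Dockerfile, all I had to do was add the group vglusers with gid 1005 and add my user to this group. Problem solved.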

# Recreate the host's vglusers group (gid 1005) so testuser can open the nvidia devices
RUN groupadd -g 1005 vglusers && \
    useradd -ms /bin/bash -u 1000 -g 1005 testuser && \
    usermod -a -G video,sudo testuser
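To check the result from inside the container, one option is to compare the user's group ids with the gid that owns a device node. This is only a sketch, assuming GNU `stat` and `id` are present in the image. The helper name `in_owning_group` and its optional second argument are made up for illustration.

```shell
# Hypothetical helper: succeed if the given gid list (default: the current
# user's groups from `id -G`) contains the gid that owns the file.
in_owning_group() {
    file_gid=$(stat -c '%g' "$1") || return 2   # numeric group id of the file
    gid_list="${2:-$(id -G)}"
    for g in $gid_list; do
        if [ "$g" = "$file_gid" ]; then
            return 0
        fi
    done
    return 1
}

# Example (as testuser in the container):
#   in_owning_group /dev/nvidiactl && nvidia-smi
```

If the check fails, the gid in the Dockerfile probably does not match the one on the host.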