Nvidia-docker2 permissions problem

Hi folks -

We are having trouble using nvidia-docker2 on Orin running L4T 35.3.1. The docker build process does not appear to apply file-ownership changes. This Dockerfile:

FROM alpine:3.18

ENV NS_USER test
ENV UID 1001

RUN adduser -D -u ${UID} ${NS_USER} && \
    passwd -u ${NS_USER} 

RUN chown -R ${UID}:${UID} /home/${NS_USER}

CMD ls -l /home

gives this result when run:

docker run chowntest:0.1.1
total 12
drwxr-xr-x    1 root     root          4096 Aug  3 22:13 .
drwxr-xr-x    1 root     root          4096 Aug  3 22:14 ..
drwxr-sr-x    1 root     root          4096 Aug  3 22:13 test

The container SHOULD show /home/test owned by test:test. On x86 Ubuntu and on Mac, the same Dockerfile produces a /home/test directory owned by test:test. This behavior has persisted across a couple of different JetPacks and AGX Orin variants. We can run the image interactively and set the ownership by hand, but of course that change does not survive recreating the container.

Has anyone else seen anything similar? Any suggestions for what might be happening?

Thanks,
sam

Hi,

Do you have any output logs you can share with us?
Does the RUN chown -R ${UID}:${UID} /home/${NS_USER} step return any error?

Thanks.

Hi Aasta -

The Docker build doesn’t throw any errors, but the resulting image simply doesn’t reflect the chown.

We were able to reproduce the problem on a Raspberry Pi, suggesting this may have something to do with aarch64.

We also found suggestive hints in Ubuntu bug reports claiming that overlayfs is the problem: Bug #1659417 “docker permission issues with overlay2 storage dri... : Bugs : linux package : Ubuntu

Finally, we found a workaround that strongly suggests this is related to overlayfs. This slight modification works as intended:

CMD chown -R ${UID}:${UID} /home/${NS_USER} && \
    ls -l /home

at the expense of being very hacky. The runtime ownership change “sticks”, we think because overlayfs is not involved.
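For anyone who has to live with this runtime workaround, a slightly cleaner pattern (a sketch with assumed names, not something from this thread) is to move the chown into an entrypoint script that fixes ownership at container start and then drops to the unprivileged user, e.g. with su-exec (apk add --no-cache su-exec) and ENTRYPOINT ["/entrypoint.sh"] in the Dockerfile:

```shell
#!/bin/sh
# entrypoint.sh (hypothetical): fix ownership at container start, where
# overlayfs is no longer involved, then run the real command as the user.
chown -R "${UID}:${UID}" "/home/${NS_USER}"
exec su-exec "${NS_USER}" "$@"
```

This keeps the image's CMD clean and makes the workaround apply to whatever command the container runs, at the cost of an extra chown on every start.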

Any suggestions welcome.

【Solution】
Downgrade the Docker version.

sudo apt install containerd=1.6.12-0ubuntu1~20.04.3
sudo apt install docker.io=20.10.21-0ubuntu1~20.04.2
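To keep apt from silently upgrading these packages again on the next update, you may also want to hold them, either with sudo apt-mark hold containerd docker.io or with a pin file like the following (a sketch; the file name is illustrative, and Pin-Priority 1001 is the conventional value that forces a version to stay installed):

```
# /etc/apt/preferences.d/docker-pin (illustrative)
Package: containerd
Pin: version 1.6.12-0ubuntu1~20.04.3
Pin-Priority: 1001

Package: docker.io
Pin: version 20.10.21-0ubuntu1~20.04.2
Pin-Priority: 1001
```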

【Cause】
I have been investigating this since the same issue occurred on JetPack 5.1.2.
Today I noticed that it is also happening with the JetPack 5.1.1 that I installed via SDK Manager.
I’ve identified the cause and describe it below.

This issue is caused by recent updates to the docker.io and containerd packages.

The versions that do not work are as follows.

apt list --installed | grep container

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

containerd/focal-updates,now 1.7.2-0ubuntu1~20.04.1 arm64 [installed,automatic]
libnvidia-container-tools/stable,now 1.10.0-1 arm64 [installed]
libnvidia-container0/stable,now 0.11.0+jetpack arm64 [installed]
libnvidia-container1/stable,now 1.10.0-1 arm64 [installed]
nvidia-container-runtime/stable,now 3.9.0-1 all [installed]
nvidia-container-toolkit/stable,now 1.11.0~rc.1-1 arm64 [installed]

apt list --installed | grep docker

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

docker.io/focal-updates,now 20.10.25-0ubuntu1~20.04.1 arm64 [installed,automatic]
nvidia-docker2/stable,now 2.11.0-1 all [installed]

The versions that work are as follows.

apt list --installed | grep docker

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

docker.io/now 20.10.21-0ubuntu1~20.04.1 arm64 [installed,upgradable to: 20.10.25-0ubuntu1~20.04.1]
nvidia-docker2/stable,now 2.11.0-1 all [installed]

apt list --installed | grep container

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

containerd/now 1.6.12-0ubuntu1~20.04.1 arm64 [installed,upgradable to: 1.7.2-0ubuntu1~20.04.1]
libnvidia-container-tools/stable,now 1.10.0-1 arm64 [installed]
libnvidia-container0/stable,now 0.11.0+jetpack arm64 [installed]
libnvidia-container1/stable,now 1.10.0-1 arm64 [installed]
nvidia-container-runtime/stable,now 3.9.0-1 all [installed]
nvidia-container-toolkit/stable,now 1.11.0~rc.1-1 arm64 [installed]

I discovered that the containerd and docker.io versions differ between the environment that works normally and the environment where the user’s ownership gets replaced by root.

Check which versions are available to install:

apt-cache madison containerd
containerd | 1.7.2-0ubuntu1~20.04.1 | http://ports.ubuntu.com/ubuntu-ports focal-updates/main arm64 Packages
containerd | 1.6.12-0ubuntu1~20.04.3 | http://ports.ubuntu.com/ubuntu-ports focal-security/main arm64 Packages
containerd | 1.3.3-0ubuntu2 | http://ports.ubuntu.com/ubuntu-ports focal/main arm64 Packages
apt-cache madison docker.io
 docker.io | 20.10.25-0ubuntu1~20.04.1 | http://ports.ubuntu.com/ubuntu-ports focal-updates/universe arm64 Packages
 docker.io | 20.10.21-0ubuntu1~20.04.2 | http://ports.ubuntu.com/ubuntu-ports focal-security/universe arm64 Packages
 docker.io | 19.03.8-0ubuntu1 | http://ports.ubuntu.com/ubuntu-ports focal/universe arm64 Packages

The earlier versions that worked correctly are still available, so install those:

sudo apt install containerd=1.6.12-0ubuntu1~20.04.3
sudo apt install docker.io=20.10.21-0ubuntu1~20.04.2

This resolves the permission issue that currently occurs when creating a user account in a Docker container.
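To verify the fix after downgrading (a sketch; the image tag is taken from the original post), rebuild without the build cache so the previously broken layers are discarded, and inspect the ownership again:

```shell
# Rebuild from scratch so cached (broken) layers are not reused
docker build --no-cache -t chowntest:0.1.1 .
docker run --rm chowntest:0.1.1
# /home/test should now be listed as owned by test:test
```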

@naisy

Thank you, this is awesome. I can confirm that your commands work. Once I installed the downgraded packages and rebuilt my images with the --no-cache flag, the permissions worked correctly on both the Jetson Orin and the RPi.

Thanks!
