Docker: failed to mount overlay: no such device storage-driver=overlay2

I have already reviewed the similar issue:

Which does not help.

Setup Details:
Hardware: Jetson Xavier NX
OS: Linux nvidia-desktop 4.9.253-tegra #1 SMP PREEMPT Wed Apr 13 13:41:15 CST 2022 aarch64 aarch64 aarch64 GNU/Linux
Docker version 20.10.12, build 20.10.12-0ubuntu2~18.04.1

R32 (release), REVISION: 7.3, GCID: 31982016, BOARD: t186ref, EABI: aarch64, DATE: Tue Nov 22 17:32:54 UTC 2022

When I try to run the NVIDIA Docker containers, I get the following error message:
‘’’
sudo dockerd
INFO[2023-08-18T21:47:26.253603724+08:00] Starting up
INFO[2023-08-18T21:47:26.265183922+08:00] detected 127.0.0.53 nameserver, assuming systemd-resolved, so using resolv.conf: /run/systemd/resolve/resolv.conf
INFO[2023-08-18T21:47:26.267152351+08:00] parsed scheme: "unix" module=grpc
INFO[2023-08-18T21:47:26.267221599+08:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2023-08-18T21:47:26.267304485+08:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 }] } module=grpc
INFO[2023-08-18T21:47:26.267362400+08:00] ClientConn switching balancer to "pick_first" module=grpc
INFO[2023-08-18T21:47:26.270441646+08:00] parsed scheme: "unix" module=grpc
INFO[2023-08-18T21:47:26.270516369+08:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2023-08-18T21:47:26.270582576+08:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 }] } module=grpc
INFO[2023-08-18T21:47:26.270621538+08:00] ClientConn switching balancer to "pick_first" module=grpc
ERRO[2023-08-18T21:47:26.278662980+08:00] failed to mount overlay: no such device storage-driver=overlay2
ERRO[2023-08-18T21:47:26.279116566+08:00] [graphdriver] prior storage driver overlay2 failed: driver not supported
failed to start daemon: error initializing graphdriver: driver not supported
‘’’

How do I resolve this error so that I can run Docker on this machine?

Hi,
Please refer to
NVIDIA L4T Base | NVIDIA NGC

Then run the Docker container for r32.7.3. It should work if you follow the steps.

Hi DaneLLL,

To clarify, this error occurs when I run any docker command, not just the NVIDIA Docker containers. For example, if I run ‘docker container ls’ or ‘docker images’, I get the error message ‘could not connect to Docker daemon’, and systemctl shows the error above.

This is more fundamental than the NVIDIA-specific Docker containers; it applies to all docker commands.

Hi,
We don’t see this error when running Docker on the Jetson platform. Please share your docker command for reference.

When I try to issue a docker command:

sudo docker container ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Then I check the status:

systemctl status docker.service
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2023-08-21 19:25:55 CST; 20h ago
     Docs: https://docs.docker.com
  Process: 32594 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
 Main PID: 32594 (code=exited, status=1/FAILURE)

Aug 21 19:25:53 nvidia-desktop systemd[1]: Failed to start Docker Application Container Engine.
Aug 21 19:25:55 nvidia-desktop systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Aug 21 19:25:55 nvidia-desktop systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Aug 21 19:25:55 nvidia-desktop systemd[1]: Stopped Docker Application Container Engine.
Aug 21 19:25:55 nvidia-desktop systemd[1]: docker.service: Start request repeated too quickly.
Aug 21 19:25:55 nvidia-desktop systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 21 19:25:55 nvidia-desktop systemd[1]: Failed to start Docker Application Container Engine.

Check with journalctl:

journalctl -eu docker
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.584716135+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.584792073+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.584899947+08:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil>
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.584965868+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.596805345+08:00" level=error msg="failed to mount overlay: no such device" storage-driver=overlay2
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.597043878+08:00" level=error msg="exec: \"fuse-overlayfs\": executable file not found in $PATH" storage-driver=fuse-overlayfs
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.601805128+08:00" level=error msg="AUFS was not found in /proc/filesystems" storage-driver=aufs
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.608008232+08:00" level=error msg="failed to mount overlay: no such device" storage-driver=overlay
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: failed to start daemon: error initializing graphdriver: devicemapper: Error running deviceCreate (CreatePool) dm_task_run failed
Aug 21 19:25:48 nvidia-desktop systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Aug 21 19:25:48 nvidia-desktop systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 21 19:25:48 nvidia-desktop systemd[1]: Failed to start Docker Application Container Engine.
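The journal above shows dockerd trying several storage drivers in turn, each failing for its own reason. As a minimal sketch (not part of the original troubleshooting), the filter below reduces such output to one line per failed driver. The function name `summarize_driver_errors` is hypothetical, and the two sample lines are copied from the log above.

```shell
#!/bin/sh
# summarize_driver_errors (hypothetical helper): reads dockerd journal lines
# on stdin and prints "driver: message" for every error line that carries a
# storage-driver field. Lines without that field are dropped.
summarize_driver_errors() {
    sed -n 's/.*level=error msg="\(.*\)" storage-driver=\([a-z0-9-]*\).*/\2: \1/p'
}

# Demo on two lines taken from the journal output above.
summarize_driver_errors <<'EOF'
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.596805345+08:00" level=error msg="failed to mount overlay: no such device" storage-driver=overlay2
Aug 21 19:25:48 nvidia-desktop dockerd[32446]: time="2023-08-21T19:25:48.601805128+08:00" level=error msg="AUFS was not found in /proc/filesystems" storage-driver=aufs
EOF
# prints:
# overlay2: failed to mount overlay: no such device
# aufs: AUFS was not found in /proc/filesystems
```

On the device itself this could be fed from `journalctl -u docker --no-pager`. The final devicemapper failure has no storage-driver field in the log, so this filter does not report it.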

I have unsuccessfully tried the steps outlined here:

Which made no difference.

Hi,
Please try the commands:

$ xhost +
$ sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.7.1
sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.7.1
[sudo] password for nvidia: 
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.

The issue is not with the NVIDIA Docker containers or the NVIDIA Docker runtime; it is with the OS and Docker itself.

Hi,
We can run the command successfully:

nvidia@nvidia-desktop:~$ sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.7.1
[sudo] password for nvidia:
Unable to find image 'nvcr.io/nvidia/l4t-base:r32.7.1' locally
r32.7.1: Pulling from nvidia/l4t-base
f46992f278c2: Pull complete
d0ec296fcb76: Pull complete
9e18ddc8ca7a: Pull complete
457ba495c8e5: Pull complete
71bca45e35bd: Pull complete
644761cdc735: Pull complete
11628dbc31eb: Pull complete
d364c3700c33: Pull complete
01869d070b2e: Pull complete
cc3009375042: Pull complete
3e182d6364dc: Pull complete
f72feb4812f9: Pull complete
151eb940bbec: Pull complete
0e9dda2495b9: Pull complete
0e78bdc2f297: Pull complete
8dc68d594a4e: Pull complete
Digest: sha256:a374d81695f172fcda9da8db23f60d8bc35948762f71f3ee69564b4f6be5ef1c
Status: Downloaded newer image for nvcr.io/nvidia/l4t-base:r32.7.1
root@nvidia-desktop:/#

Please check whether you can re-flash the system image and try again.

Reflashing the system image is not an option for us, as the device is not physically accessible.

Are there any other steps we can try?

Hi @DaneLLL - are there any other suggested steps?

The device is remote: it can be accessed via SSH, but it cannot be physically accessed for reflashing.

Hi,
We suggest physically accessing the board and doing a clean re-flash. This would be more efficient than remote trial and error.

Thanks for the suggestion; however, in this case that option does not exist, as the device is not physically accessible.

While I appreciate that remote trial and error is difficult, would you be able to provide any suggestions or paths to attempt?

When I run the command to check for the driver, I get:

modprobe: FATAL: Module overlay not found in directory /lib/modules/4.9.253-tegra

This is present on other, supposedly identical flashed boards.
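For comparing this board against the working ones, here is a minimal sketch of a check that separates "overlay is already registered with the running kernel" from "a module file exists but is not loaded" from "no overlay support at all". The function names are hypothetical, and the sketch assumes the module file is named `overlay.ko` (possibly with a compression suffix) somewhere under the kernel's module tree.

```shell
#!/bin/sh
# has_overlay_module MODULES_ROOT KERNEL_RELEASE (hypothetical helper)
# Succeeds if an overlay module file exists in that kernel's module tree.
has_overlay_module() {
    modules_root=$1
    release=$2
    # Assumption: the module file is overlay.ko, optionally compressed.
    find "$modules_root/$release" -name 'overlay.ko*' 2>/dev/null | grep -q .
}

# overlay_registered FILESYSTEMS_FILE (hypothetical helper)
# Succeeds if "overlay" is listed in a /proc/filesystems-style file.
overlay_registered() {
    grep -qw overlay "$1"
}

if overlay_registered /proc/filesystems; then
    echo "overlay is already registered with the running kernel"
elif has_overlay_module /lib/modules "$(uname -r)"; then
    echo "overlay.ko exists for $(uname -r); try: sudo modprobe overlay"
else
    echo "no overlay support found for $(uname -r)"
fi
```

Running the same script on a working board and on the failing one should make the difference visible without changing anything on either system.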

Update - adding some extra information:
On the device, there seem to be two kernels installed:

ls /lib/modules
4.9.253-tegra  4.9.299-tegra

overlayfs is in the 4.9.299-tegra kernel:

ls 4.9.299-tegra/kernel/fs
binfmt_misc.ko  btrfs  cifs  fuse  nfs_common  nfsd  overlayfs

But missing in the other (running) kernel - note no ‘fs’ directory here:

ls 4.9.253-tegra/kernel/
drivers

but uname -a returns:

Linux nvidia-desktop 4.9.253-tegra

Is it possible to use the new kernel? Is that a risky operation over a remote connection?
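To make the mismatch described above easier to see at a glance, here is a minimal sketch that lists every module tree under /lib/modules, marks the one matching the running kernel, and reports whether it contains a kernel/fs/overlayfs directory. The function name `list_module_trees` is hypothetical. Note that a module tree for 4.9.299-tegra only shows that modules were installed; it does not by itself show that a matching kernel image is present or bootable. On L4T the booted image is usually selected in /boot/extlinux/extlinux.conf, which is an assumption here and worth verifying on the device before changing anything remotely.

```shell
#!/bin/sh
# list_module_trees MODULES_ROOT RUNNING_RELEASE (hypothetical helper)
# Prints one tab-separated line per installed module tree:
#   <release>  running|installed  overlayfs=yes|no
list_module_trees() {
    modules_root=$1
    running=$2
    for dir in "$modules_root"/*/; do
        [ -d "$dir" ] || continue
        release=$(basename "$dir")
        if [ -d "$dir/kernel/fs/overlayfs" ]; then overlay=yes; else overlay=no; fi
        if [ "$release" = "$running" ]; then state=running; else state=installed; fi
        printf '%s\t%s\toverlayfs=%s\n' "$release" "$state" "$overlay"
    done
}

list_module_trees /lib/modules "$(uname -r)"
```

With the directory layout shown above, this would report 4.9.253-tegra as running with overlayfs=no and 4.9.299-tegra as installed with overlayfs=yes.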

