Can't start nvidia docker after moving docker to ssd

dragan.bogatic · January 24, 2021, 5:15am

I followed the web instructions in Software Setup - Jetson Xavier - RACECAR/X, which explain how to move the Docker folder to an SSD. Everything went smoothly, and a test with the Docker hello-world image worked as well. However, when I downloaded the NVIDIA TensorFlow container and ran it, it gave me an error. I would appreciate it if someone could help. Thank you.

dragan@dragan-desktop:~$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-tensorflow:r32.4.4-tf2.3-py3

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused “process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=23175 /xavier_ssd/home/dragan/docker/overlay2/966a2684ed283da28a5ec67bc7648c7a590533efabef3537cf194ebd1af6a5b8/merged]\\nnvidia-container-cli: mount error: file creation failed: /xavier_ssd/home/dragan/docker/overlay2/966a2684ed283da28a5ec67bc7648c7a590533efabef3537cf194ebd1af6a5b8/merged/usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18: file exists\\n\""”: unknown.

dragan@dragan-desktop:~$ sudo service docker start
dragan@dragan-desktop:~$ sudo docker run hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:

 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/get-started/

Kind regards,
Dragan

AastaLLL · January 25, 2021, 2:47am

Hi

Would you mind first creating a symlink, as mentioned in the link below?

Suppose your new folder is /XavierSSD500/var/lib/docker.
Please try this:

$ ln -s /XavierSSD500/var/lib/docker /var/lib/docker
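A point worth noting about the command above: if /var/lib/docker still exists as a real directory, ln -s creates the link inside it instead of replacing it. The following is a minimal sketch of a guarded version. The helper name link_docker_dir and the backup path /var/lib/docker.bak are hypothetical and used only for illustration; the paths in the comments are the ones from this post.

```shell
# Sketch: point the default Docker directory at a copy on the SSD.
# link_docker_dir is a hypothetical helper, not part of Docker.
link_docker_dir() {
    new_dir=$1     # e.g. /XavierSSD500/var/lib/docker
    link_path=$2   # e.g. /var/lib/docker

    if [ ! -d "$new_dir" ]; then
        echo "target directory not found: $new_dir" >&2
        return 1
    fi
    # ln -s would place the link *inside* an existing directory,
    # so refuse unless the path is absent or already a symlink.
    if [ -e "$link_path" ] && [ ! -L "$link_path" ]; then
        echo "refusing to replace existing path: $link_path" >&2
        return 1
    fi
    ln -sfn "$new_dir" "$link_path"
}

# Intended use on the device, from a root shell, with the daemon stopped:
#   service docker stop
#   mv /var/lib/docker /var/lib/docker.bak
#   link_docker_dir /XavierSSD500/var/lib/docker /var/lib/docker
#   service docker start
```

The helper refuses to touch an existing non-symlink path, so the old directory has to be moved aside deliberately before the link is created.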

Thanks.

dragan.bogatic · January 25, 2021, 5:11am

Hello,
I have no issue switching the default folder where Docker saves its images. My issue is with starting containers after that: hello-world starts just fine, but the NVIDIA container does not.

This is my daemon.json file:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },

    "data-root": "/mnt/xavier_ssd/Docker"

}
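Since a malformed daemon.json can be hard to spot by eye, a quick check is to parse the file and then ask Docker which data root it is actually using. This is a minimal sketch, assuming python3 is installed and that the file lives at /etc/docker/daemon.json (the usual location, not stated in this thread). The function names are hypothetical.

```shell
# Sketch: sanity-check the Docker daemon configuration.
# check_daemon_json and show_docker_root are hypothetical helper names.

# Succeeds only if the file parses as JSON.
check_daemon_json() {
    file=${1:-/etc/docker/daemon.json}
    python3 -m json.tool "$file" > /dev/null 2>&1
}

# Prints the data root the running daemon reports.
show_docker_root() {
    docker info --format '{{ .DockerRootDir }}'
}

# Intended use on the device:
#   check_daemon_json && echo "daemon.json parses"
#   sudo docker info --format '{{ .DockerRootDir }}'
```

If the reported root matches the data-root value in the file, the move itself took effect and the problem lies elsewhere.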

I flashed my Xavier again, and the same thing happens after redoing the steps:

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-tensorflow:r32.4.4-tf2.3-py3

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused “process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=14138 /mnt/xavier_ssd/Docker/overlay2/8a0ea8137638b55a2fda06b3c15ec3a6a7e38f175b6ca350c40f31a1ae4e64a4/merged]\\nnvidia-container-cli: mount error: file creation failed: /mnt/xavier_ssd/Docker/overlay2/8a0ea8137638b55a2fda06b3c15ec3a6a7e38f175b6ca350c40f31a1ae4e64a4/merged/usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18: file exists\\n\""”: unknown.

Any other suggestions how I can make this work? Thank you.

Dragan

AastaLLL · January 26, 2021, 7:47am

Hi,

The error occurs when the nvidia runtime tries to mount a file under /usr/lib/aarch64-linux-gnu/ (here libnvidia-fatbinaryloader.so.440.18) into the container but finds that it already exists there.
A similar issue can be found in the below link:

github.com/NVIDIA/nvidia-docker

nvidia-container-cli: mount error

opened 01:44PM - 10 Sep 18 UTC

closed 08:24PM - 23 Nov 18 UTC

xiaoxinyi

``` OCI runtime create failed: container_linux.go:348: starting container proce…ss caused "process_linux.go:402: container init caused \"process_linux.go:385: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --require=cuda>=8.0 --pid=32239 /var/lib/docker/overlay2/bd4f3570d4eb99d9df1117bd3751a00c8c0eb63429e65eba2a196c6c543cc52f/merged]\\\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/bd4f3570d4eb99d9df1117bd3751a00c8c0eb63429e65eba2a196c6c543cc52f/merged/usr/bin/nvidia-smi: file exists\\\\n\\\"\"": unknown ```

To solve this, could you check whether any other command also mounts this folder and causes the conflict?
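One possible starting point for that check is to look for copies of the file named in the error that already exist under Docker's data root. This is only a sketch: find_existing_copies is a hypothetical helper, and the idea that a pre-existing copy in an image layer explains the conflict is an assumption, not something confirmed in this thread.

```shell
# Sketch: list files under a directory tree whose name matches a pattern.
# find_existing_copies is a hypothetical helper name.
find_existing_copies() {
    root=$1      # e.g. /mnt/xavier_ssd/Docker/overlay2
    pattern=$2   # e.g. 'libnvidia-fatbinaryloader.so*'
    find "$root" -name "$pattern" 2>/dev/null
}

# Intended use on the device, from a root shell:
#   find_existing_copies /mnt/xavier_ssd/Docker/overlay2 'libnvidia-fatbinaryloader.so*'
```

A match inside a layer's diff directory would suggest that the image itself already contains the file the runtime is trying to create.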

Thanks.

dragan.bogatic · January 26, 2021, 1:03pm

Hi,
I don’t think I can figure this one out on my own, so I decided to abandon this approach and go with a regular install of the Python packages instead of using containers.

The reason I wanted to use containers was that I was getting an error message when running an LSTM model at the MinMax Scaler cell, but you provided a solution in another thread, https://forums.developer.nvidia.com/t/error-importerror-usr-lib-aarch64-linux-gnu-libgomp-so-1-cannot-allocate-memory-in-static-tls-block-i-looked-through-available-threads-already/166494, that seems to work now if I run export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1 in the terminal prior to starting Jupyter. I appreciate your help; I think I will be OK now.
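For anyone repeating this workaround, here is a minimal sketch that applies the preload only to the Jupyter process instead of exporting it for the whole session. The helper name preload_value is hypothetical, and the jupyter notebook command line is an assumption about how Jupyter is launched.

```shell
# Sketch: build an LD_PRELOAD value that puts one library first and
# keeps anything already listed. preload_value is a hypothetical helper.
preload_value() {
    lib=$1
    if [ ! -f "$lib" ]; then
        echo "library not found: $lib" >&2
        return 1
    fi
    # Append the current LD_PRELOAD (if any) after a colon.
    printf '%s\n' "$lib${LD_PRELOAD:+:$LD_PRELOAD}"
}

# Intended use on the device:
#   LD_PRELOAD=$(preload_value /usr/lib/aarch64-linux-gnu/libgomp.so.1) jupyter notebook
```

Setting the variable on the command line limits it to that one process, so other programs started from the same terminal are unaffected.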

Regards,
Dragan
