Jetson AGX Xavier cannot start a basic Docker container

Hello,

I just got a new Jetson AGX Xavier board and tried to start a basic ML container from NGC, but it fails to start:

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.4.4-py3
Unable to find image 'nvcr.io/nvidia/l4t-ml:r32.4.4-py3' locally
r32.4.4-py3: Pulling from nvidia/l4t-ml
e74fe6ef6bd6: Already exists 
7dcdd1c8f1d2: Already exists 
148ea20d31e0: Already exists 
fbc4cd4d050b: Already exists 
a21b0b3d8206: Already exists 
4ba0c94f9855: Already exists 
6fb19c1062d0: Already exists 
84ff17ad4b18: Already exists 
5ac903fdc4a8: Already exists 
ecf00917e120: Already exists 
30d000a9cd22: Already exists 
a26b515ffe8f: Already exists 
a199cb2dd71e: Already exists 
c4f4e0f882d3: Already exists 
3e956de9ea4b: Already exists 
e26b78d1aaed: Already exists 
ac42496d0bc2: Already exists 
7db2983f5802: Already exists 
7caf5120c657: Pull complete 
515bee6e4e21: Pull complete 
665a812071a4: Pull complete 
fa149c1c8827: Pull complete 
f159a33bd92b: Pull complete 
9d574cc5f4bd: Pull complete 
9b6f3604c42b: Pull complete 
8999d5401777: Pull complete 
445213ad4422: Pull complete 
fd4f94a64928: Pull complete 
bf481d5e0a20: Pull complete 
d9d3a4f227ec: Pull complete 
871ab20f86fa: Pull complete 
42382b377b81: Pull complete 
355caf84fb48: Pull complete 
08daa9581bc5: Pull complete 
f31b8c487ed8: Pull complete 
Digest: sha256:c3c1ebea04bbb428e50480b52c06af03b7172fbda1d5309c60f735f9e0797663
Status: Downloaded newer image for nvcr.io/nvidia/l4t-ml:r32.4.4-py3
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

Here is basic info about the board:

 - NVIDIA Jetson AGX Xavier [16GB]
   * Jetpack 4.4 [L4T 32.4.3]
   * NV Power Mode: MODE_30W_ALL - Type: 3
   * jetson_stats.service: active
 - Libraries:
   * CUDA: 10.2.89
   * cuDNN: 8.0.0.180
   * TensorRT: 7.1.3.0
   * Visionworks: 1.6.0.501
   * OpenCV: 4.1.1 compiled CUDA: NO
   * VPI: 0.3.7
   * Vulkan: 1.2.70

Docker:

Client: Docker Engine - Community
 Version:           20.10.5
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        55c4c88
 Built:             Tue Mar  2 20:18:54 2021
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.5
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       363e9a8
  Built:            Tue Mar  2 20:16:53 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker runtime:

cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
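
For anyone double-checking a similar setup, here is a minimal sketch that tests whether a runtime named nvidia is registered in the Docker daemon config. It assumes the config lives at /etc/docker/daemon.json as shown above. The function name has_nvidia_runtime is made up for illustration, and the check is a crude text match rather than a real JSON parse.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or the NVIDIA tooling).
# Succeeds if the given daemon.json registers a runtime called "nvidia"
# whose path ends in nvidia-container-runtime.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    # Strip whitespace so the match does not depend on indentation.
    [ -r "$config" ] &&
        tr -d ' \t\n' < "$config" |
        grep -q '"nvidia":{"path":"[^"]*nvidia-container-runtime"'
}

if has_nvidia_runtime; then
    echo "nvidia runtime is registered in daemon.json"
else
    echo "nvidia runtime not found in daemon.json"
fi
```

A passing check only says the runtime is configured. It does not say the runtime hook can actually talk to the driver, which is where the error above comes from.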

The same container runs perfectly fine on my Jetson Nano and Jetson TX2.

Hi,

Your environment is r32.4.3.
Could you give the container below a try?

$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.4.3-py3

Thanks.
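
The idea of matching the container tag to the installed L4T release can be sketched as below. This is only an illustration: it assumes the first line of /etc/nv_tegra_release has the usual "# R32 (release), REVISION: 4.3, ..." form, the helper name l4t_tag is hypothetical, and not every L4T release necessarily has a matching l4t-ml tag on NGC.

```shell
#!/bin/sh
# Hypothetical helper: turn the header line of an nv_tegra_release file
# into a tag prefix such as "r32.4.3".
l4t_tag() {
    # $1: path to the release file (defaults to the usual Jetson location)
    release_file="${1:-/etc/nv_tegra_release}"
    head -n 1 "$release_file" |
        sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/r\1.\2/p'
}

# Only meaningful on a Jetson; skipped silently elsewhere.
if [ -r /etc/nv_tegra_release ]; then
    echo "nvcr.io/nvidia/l4t-ml:$(l4t_tag)-py3"
fi
```

On a board whose release file reports R32 with REVISION 4.3, this prints the r32.4.3-py3 image name used in the command above.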

Thanks @AastaLLL for looking into this. I just gave it a quick try, and it still returned the same error:

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.4.3-py3
Unable to find image 'nvcr.io/nvidia/l4t-ml:r32.4.3-py3' locally
r32.4.3-py3: Pulling from nvidia/l4t-ml
c796196a5194: Pulling fs layer 
9f1f9f625de3: Pulling fs layer 
692bb77a7fc0: Pulling fs layer 
7c6fea64666e: Pulling fs layer 
2e07510b3b6f: Pulling fs layer 
f3cef679c558: Pull complete 
01067c06de71: Pull complete 
607a871f53a8: Pull complete 
77c272c815a4: Pull complete 
d6e9ae6c556e: Pull complete 
6c31b14f8325: Pull complete 
da342f04dbe2: Pull complete 
bc1f73093866: Pull complete 
801ec5982390: Pull complete 
20f4f4b58bf1: Pull complete 
998db3ceb21f: Pull complete 
8f6eafd35194: Pull complete 
c3cf45f768ae: Pull complete 
7b3abe05cce9: Pull complete 
8977efcd7cb6: Pull complete 
5fa352cd64d8: Pull complete 
f858db5ee73c: Pull complete 
c25066acf55c: Pull complete 
cee7c26afb29: Pull complete 
e95be1d362be: Extracting [==================================================>]    307MB/307MB
e95be1d362be: Pull complete 
493dda69c3c8: Pull complete 
8e6fc188d98a: Pull complete 
8ca5b0b1221c: Pull complete 
e927bc5b62ba: Pull complete 
c62d62d4b6d8: Pull complete 
cd1b219d0452: Pull complete 
3604ec2ef5ba: Pull complete 
48514e1a3709: Pull complete 
f0491a7196f1: Pull complete 
b68bcdfc7c06: Pull complete 
356c673f0194: Pull complete 
Digest: sha256:61de72a4d5c34841c2135004f52b6813101610a4829817d93b8736298cc9ac39
Status: Downloaded newer image for nvcr.io/nvidia/l4t-ml:r32.4.3-py3
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

Hello @tnaduc ,

Can you try it with a newer image?

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.5.0-py3

Regards.

Hi @ozguryildiz ,

Thank you for your advice. I tried it, but the result is still the same:

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.5.0-py3
[sudo] password for mic-730ai: 
Unable to find image 'nvcr.io/nvidia/l4t-ml:r32.5.0-py3' locally
r32.5.0-py3: Pulling from nvidia/l4t-ml
5000a6c32c5a: Pull complete 
8e855b69096a: Pull complete 
8db8dbbd4bb9: Pull complete 
833dc3235950: Pull complete 
f79d264135a3: Pull complete 
1c40f77bb35b: Pull complete 
1990ecf0bfb7: Pull complete 
c8ffbfd7f0aa: Pull complete 
ba785779122a: Pull complete 
024ce79b6790: Pull complete 
9b09da3b5483: Pull complete 
17f974a43cf9: Pull complete 
211b56b73ff1: Pull complete 
78aca4be1f3b: Pull complete 
95f34310bbda: Pull complete 
678c9d1557e9: Pull complete 
ec17ad7cab01: Pull complete 
08fb1eee5328: Pull complete 
df804e245232: Pull complete 
4f9a01a0e955: Pull complete 
a44515425a95: Pull complete 
cf055aaf10ad: Pull complete 
844538a014b9: Pull complete 
159036d408be: Pull complete 
ae321b977e2f: Pull complete 
ae4d2ced71ce: Pull complete 
1bc11ebf8522: Pull complete 
ef6ff4fc67ed: Pull complete 
3b45df8b8d80: Pull complete 
a1a094c74107: Pull complete 
7400833e9fab: Pull complete 
21608f9f63e2: Pull complete 
50b80108db64: Pull complete 
800ad6d6d249: Pull complete 
a5297af96097: Pull complete 
5d3a862a0013: Pull complete 
418524dee780: Pull complete 
Digest: sha256:af8155d948946e76fb398a93a726ee4e4f06a69574284a8bc8929f0731efdc61
Status: Downloaded newer image for nvcr.io/nvidia/l4t-ml:r32.5.0-py3
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

The same happens with the l4t-base image:

sudo docker run -it --rm --net=host --runtime nvidia  -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.4.3
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

Hello @tnaduc ,

Please run the command below and check what happens.

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

Hope it works.
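
To see which NVIDIA container packages are actually installed after that step, something like the sketch below might help. It assumes a Debian-style system with dpkg-query available. The filter function nvidia_container_pkgs is a made-up name, and the pattern it uses is a guess at the relevant package name prefixes.

```shell
#!/bin/sh
# Hypothetical filter: keep only "name version" lines whose package name
# looks like part of the NVIDIA container stack.
nvidia_container_pkgs() {
    grep -E '^(lib)?nvidia-(container|docker)'
}

# Only runs where dpkg-query exists (Debian/Ubuntu, including L4T).
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Package} ${Version}\n' | nvidia_container_pkgs ||
        echo "no NVIDIA container packages found"
fi
```

Posting this output together with the Docker version details makes it easier for others to compare against a working board.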

Hi @ozguryildiz , thank you.
I tried that and still get the same outcome. :(

Can you check this post?