l4t-ml:r35.2.1-py3 goes down a few seconds after booting using docker

Hello, I’m using a Jetson Xavier AGX: R35 (release), REVISION: 3.1, GCID: 32827747, BOARD: t186ref, EABI: aarch64, DATE: Sun Mar 19 15:19:21 UTC 2023

with docker daemon having this configuration:

{
    "data-root": "/home/jetson/docker-data",
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

When I run sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r35.2.1-py3, I get:

allow 10 sec for JupyterLab to start @ http://192.168.1.29:8888 (password nvidia)
JupterLab logging location: /var/log/jupyter.log (inside the container)

After that, the container goes down and the docker command becomes unusable:
root@Scylla:/# docker
bash: docker: command not found

It appears that the interactive flags (-it) make docker unusable: when I run the same command without them, the container still crashes, but I have no trouble using the docker command.

I tried to get some logs, but even /var/log/jupyter.log is empty. I have looked for hours but can’t solve this. I tried uninstalling and reinstalling docker and nvidia-container-runtime, but it didn’t change anything.

Do you have any idea where I should look to troubleshoot this, please?

@yasser.antonio.leote.cher are you sure this isn’t just the terminal running inside the container? The docker packages aren’t installed inside the container, so it’s expected that the command wouldn’t be found there. The root@Scylla:/# prompt would seem to indicate that the shell is running in the container.

Unless l4t-ml is started with a different run command, it will start the JupyterLab server in the background and then give you that prompt. If you suspect the container is crashing, you could try starting it with /bin/bash, which should override its default command.
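One quick way to tell where a prompt is running is to check from that prompt itself. The sketch below is a heuristic and not part of the l4t-ml image: it assumes Docker has created the /.dockerenv marker file in the container's root filesystem, which other container runtimes may not do, and the variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Heuristic check: is this shell running inside a Docker container?
# Assumption: Docker creates /.dockerenv in the container's root filesystem.
if [ -f /.dockerenv ]; then
    location="container"
else
    location="host"
fi

# Is the docker CLI on this shell's PATH?
if command -v docker >/dev/null 2>&1; then
    docker_cli="found"
else
    docker_cli="not found"
fi

echo "shell location: $location"
echo "docker CLI:     $docker_cli"
```

A result of "container" together with "not found" would match the situation described above: the prompt belongs to the container, where the docker packages are not installed.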

Damn, it was (big facepalm). Still, I couldn’t get the container to stay up without -it, so I just added tty: true to my compose file to deploy it without it exiting. I don’t think it’s really good practice to require a pseudo-interaction to keep the entrypoint from terminating the container.

Thank you for your help

@yasser.antonio.leote.cher I don’t think the container is “crashing”; I think the default entrypoint command evaluates to /bin/bash, and without a tty, bash exits. If you run your own command, it should stay running without -it or --tty:

sudo docker run --runtime nvidia --network host --name base \
    --volume /media/nvidia/NVME/test:/mount \
    nvcr.io/nvidia/l4t-base:r35.2.1 \
    /mount/loop.sh

loop 1 -- Tue Jun 20 14:55:17 UTC 2023
loop 2 -- Tue Jun 20 14:55:18 UTC 2023
loop 3 -- Tue Jun 20 14:55:19 UTC 2023
loop 4 -- Tue Jun 20 14:55:20 UTC 2023
loop 5 -- Tue Jun 20 14:55:21 UTC 2023
loop 6 -- Tue Jun 20 14:55:22 UTC 2023
loop 7 -- Tue Jun 20 14:55:23 UTC 2023
loop 8 -- Tue Jun 20 14:55:24 UTC 2023
loop 9 -- Tue Jun 20 14:55:25 UTC 2023
loop 10 -- Tue Jun 20 14:55:26 UTC 2023

loop.sh test script:

#!/usr/bin/env bash

for count in {1..10};
do
	echo "loop $count -- $(date)"
	sleep 1.0
done
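The difference between a bare bash and bash running a command can be reproduced without Docker. This is only a rough sketch: it assumes that a container started without -i or -t sees an empty stdin, which is approximated here by redirecting from /dev/null. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Assumption: a container started without -i/-t sees an empty stdin,
# approximated here by redirecting stdin from /dev/null.

# 1) bash with no command and an empty stdin reads EOF and exits at once.
bash </dev/null
status_no_stdin=$?
echo "bash with empty stdin exited with status $status_no_stdin"

# 2) bash given an explicit command runs that command to completion,
#    whether or not a terminal is attached.
loop_output=$(bash -c 'for count in 1 2 3; do echo "loop $count"; done' </dev/null)
echo "$loop_output"
```

In the first case bash has nothing to read and exits cleanly, which from the outside can look like the container going down. In the second case the process lives as long as the command does.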

This behavior doesn’t seem specific to the l4t-ml container or to any container in particular. If you start the container in detached mode (-d) and want to interact with its terminal via docker attach, it should have been started with -it.
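A minimal sketch of that detached workflow follows. The container name l4t-ml and the run helper are hypothetical and only for illustration. The helper defaults to a dry run that prints the commands, so nothing is started unless DRY_RUN=0 is set on the Jetson.

```shell
#!/usr/bin/env bash
# Sketch: start the container detached but with a TTY and open stdin (-d -it),
# so the default /bin/bash stays alive and can be attached to later.
IMAGE="nvcr.io/nvidia/l4t-ml:r35.2.1-py3"   # image from this thread
NAME="l4t-ml"                                # hypothetical container name

# Hypothetical helper: with DRY_RUN=1 (the default here) it only prints the
# command; with DRY_RUN=0 it executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run sudo docker run -d -it --runtime nvidia --network host --name "$NAME" "$IMAGE"
run sudo docker ps --filter "name=$NAME"    # the container should be listed as running
run sudo docker attach "$NAME"              # attach to the container's bash terminal
```

When attached to a container started with -it, the Ctrl-p Ctrl-q key sequence detaches again without stopping it.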
