TLT mask rcnn error: Tlt.components.docker_handler.docker_handler: Stopping container

The output is the same

OK, so could you run the following command on the host PC instead of in the Jupyter notebook?

$ tlt mask_rcnn run ls

Error response from daemon: Container 7db4eef7d34e7141dd3385515f45529a74e1e21d6e2efc6bacdc71f3642c4791 is not running
2021-04-15 09:22:04,215 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

I am afraid something is wrong with an old container.
I suggest running the command below.
$ docker rm -fv 7db4eef7d34e7141dd3385515f45529a74e1e21d6e2efc6bacdc71f3642c4791

The output says that this container does not exist:

Error: No such container: 7db4eef7d34e7141dd3385515f45529a74e1e21d6e2efc6bacdc71f3642c4791

I just ran the first command and I get the following error:

Error response from daemon: Container ded56e800de5d945f1a69944c622a74ab89e0196a774eebce34b56c45603216b is not running
2021-04-15 09:29:40,810 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
Traceback (most recent call last):
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/api/client.py", line 259, in _raise_for_status
    response.raise_for_status()
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.40/containers/ded56e800de5d945f1a69944c622a74ab89e0196a774eebce34b56c45603216b/stop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aqc/Envs/mrcnn/bin/tlt", line 8, in <module>
    sys.exit(main())
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/tlt/entrypoint/entrypoint.py", line 114, in main
    args[1:]
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/tlt/components/instance_handler/local_instance.py", line 263, in launch_command
    docker_handler.run_container(command)
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/tlt/components/docker_handler/docker_handler.py", line 276, in run_container
    self.stop_container()
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/tlt/components/docker_handler/docker_handler.py", line 283, in stop_container
    self._container.stop()
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/models/containers.py", line 436, in stop
    return self.client.api.stop(self.id, **kwargs)
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/api/container.py", line 1167, in stop
    self._raise_for_status(res)
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/aqc/Envs/mrcnn/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("No such container: ded56e800de5d945f1a69944c622a74ab89e0196a774eebce34b56c45603216b")
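
Note that the container ID in this error (ded56e80…) is not the one from the earlier run (7db4eef7…). That suggests, though it does not prove, that each `tlt` invocation creates a new container that exits immediately, rather than one stale container being stuck. A minimal Python sketch for pulling the IDs out of saved log output so that runs can be compared; the helper name `extract_container_ids` and the regular expression are assumptions based only on the log lines in this thread, not on any documented Docker message format:

```python
import re

# Pattern inferred from the daemon messages quoted in this thread
# (an assumption, not a documented format). Docker IDs are hex strings;
# 12 characters is the short form, 64 the full form.
_NOT_RUNNING = re.compile(r"Container ([0-9a-f]{12,64}) is not running")


def extract_container_ids(log_text):
    """Return the container IDs named in 'is not running' errors, in order."""
    return _NOT_RUNNING.findall(log_text)


if __name__ == "__main__":
    log = (
        "Error response from daemon: Container "
        "7db4eef7d34e7141dd3385515f45529a74e1e21d6e2efc6bacdc71f3642c4791"
        " is not running\n"
        "Error response from daemon: Container "
        "ded56e800de5d945f1a69944c622a74ab89e0196a774eebce34b56c45603216b"
        " is not running\n"
    )
    ids = extract_container_ids(log)
    # If every ID is distinct, a fresh container was created on each run.
    print("all distinct:", len(set(ids)) == len(ids))
```

If the IDs differ on every run, removing one specific ID with `docker rm` is unlikely to help, which matches the "No such container" result above.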

Which command?

!tlt mask_rcnn run bash $SPECS_DIR/download_and_preprocess_coco.sh $DATA_DOWNLOAD_DIR

First, we should make sure you can run the tlt command successfully on your host PC.
Can you run the commands below?
$ tlt info

$ tlt --help

tlt info:
dockers: ['nvcr.io/nvidia/tlt-streamanalytics', 'nvcr.io/nvidia/tlt-pytorch']
format_version: 1.0
tlt_version: 3.0
published_date: 02/02/2021

tlt --help
usage: tlt [-h]
{list,stop,info,augment,classification,detectnet_v2,dssd,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,punctuation_and_capitalization,question_answering,retinanet,speech_to_text,ssd,text_classification,tlt-converter,token_classification,unet,yolo_v3,yolo_v4}

Launcher for TLT

optional arguments:
-h, --help show this help message and exit

tasks:
{list,stop,info,augment,classification,detectnet_v2,dssd,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,punctuation_and_capitalization,question_answering,retinanet,speech_to_text,ssd,text_classification,tlt-converter,token_classification,unet,yolo_v3,yolo_v4}

How about the command below?
$ tlt ssd run ls

Error response from daemon: Container 30bd3a569a996692fa0c9a5fff81ef886983f06e135a6d4e79fb0c2e5b9e0c82 is not running
2021-04-15 09:43:37,669 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

How about $ tlt list?

============== ================== =========
container_id container_status command
============== ================== =========
============== ================== =========

According to the TLT Launcher page of the Transfer Learning Toolkit 3.0 documentation, have you logged in to the NGC Docker registry (nvcr.io)?
$ docker login nvcr.io

  1. Username: "$oauthtoken"
  2. Password: "YOUR_NGC_API_KEY"

Yes, I am logged in:

Authenticating with existing credentials…
WARNING! Your password will be stored unencrypted in /home/aqc/.docker/config.json.
Configure a credential helper to remove this warning. See

Login Succeeded

However, when I run:
$ docker run --runtime=nvidia -it nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3 /bin/bash

I have the following error:
standard_init_linux.go:211: exec user process caused "exec format error"

I'm running all commands on a Jetson Nano:
$ uname -m
aarch64

Please let me know if you can run the command below successfully.
$ docker run --runtime=nvidia -it nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3 /bin/bash

Where did you run the command: on an x86 host PC, a Jetson device, or something else?

I think I know the reason now. Please note that the tlt command should be run on a host PC, not on a Jetson device.
Running tlt training, launching the docker, and running the Jupyter notebook all happen on the host PC.

After training, the model can be deployed and inference run on either a host PC or a Jetson device.
For more information, please see https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/requirements_and_installation.html#
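
An "exec format error" from Docker generally means the binaries in the image were built for a different CPU architecture than the machine running them, which is consistent with `uname -m` reporting `aarch64` above. As a rough preflight check before invoking `tlt`, something like the following Python sketch could be used. The function name `is_supported_host` and the set of accepted machine strings are assumptions for illustration; they are not part of the TLT launcher, and the assumption that the TLT 3.0 containers target x86_64 only is taken from this thread.

```python
import platform

# Assumption: the TLT 3.0 training containers target x86_64 hosts only,
# as the answer in this thread implies. "amd64" covers platforms that
# report the same architecture under that name.
SUPPORTED_MACHINES = {"x86_64", "amd64"}


def is_supported_host(machine=None):
    """Return True if `machine` (default: this host) looks like an x86_64 PC."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in SUPPORTED_MACHINES


if __name__ == "__main__":
    if is_supported_host():
        print("x86_64 host detected; the tlt launcher should be usable here.")
    else:
        print(
            f"Host architecture is {platform.machine()!r}; run tlt on an "
            "x86_64 host PC and keep this device for deployment."
        )
```

On a Jetson Nano, `platform.machine()` would be expected to return the same `aarch64` string that `uname -m` printed, so the check would take the second branch.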
