docker.errors.ImageNotFound after following "nvidia/tao/cv_samples:v1.4.1"

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : RTX 3090
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc): Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)

$ !tao info --verbose

Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit-tf: 			
		v3.22.05-tf1.15.5-py3: 				
			docker_registry: nvcr.io
			tasks: 
				1. augment
				2. bpnet
				3. classification
				4. dssd
				5. faster_rcnn
				6. emotionnet
				7. efficientdet
				8. fpenet
				9. gazenet
				10. gesturenet
				11. heartratenet
				12. lprnet
				13. mask_rcnn
				14. multitask_classification
				15. retinanet
				16. ssd
				17. unet
				18. yolo_v3
				19. yolo_v4
				20. yolo_v4_tiny
				21. converter
		v3.22.05-tf1.15.4-py3: 				
			docker_registry: nvcr.io
			tasks: 
				1. detectnet_v2
	nvidia/tao/tao-toolkit-pyt: 			
		v3.22.05-py3: 				
			docker_registry: nvcr.io
			tasks: 
				1. speech_to_text
				2. speech_to_text_citrinet
				3. speech_to_text_conformer
				4. action_recognition
				5. pointpillars
				6. pose_classification
				7. spectro_gen
				8. vocoder
		v3.21.11-py3: 				
			docker_registry: nvcr.io
			tasks: 
				1. text_classification
				2. question_answering
				3. token_classification
				4. intent_slot_classification
				5. punctuation_and_capitalization
	nvidia/tao/tao-toolkit-lm: 			
		v3.22.05-py3: 				
			docker_registry: nvcr.io
			tasks: 
				1. n_gram
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022

• Training spec file(If have, please share here): not applicable
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.):
After following the “nvidia/tao/cv_samples:v1.4.1” notebook, I ran:

!tao detectnet_v2 dataset_convert \
                  -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                  -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

Error message:

Converting Tfrecords for kitti trainval dataset
2022-11-12 19:07:34,259 [INFO] root: Registry: ['nvcr.io']
2022-11-12 19:07:34,322 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3
Traceback (most recent call last):
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 259, in _raise_for_status
    response.raise_for_status()
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/images/sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716/json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ben/miniconda3/envs/ai/bin/tao", line 8, in <module>
    sys.exit(main())
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/entrypoint/entrypoint.py", line 113, in main
    local_instance.launch_command(
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/instance_handler/local_instance.py", line 319, in launch_command
    docker_handler.run_container(command)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/docker_handler/docker_handler.py", line 280, in run_container
    if not self._check_image_exists():
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/docker_handler/docker_handler.py", line 135, in _check_image_exists
    image_inspection_content = self._api_client.inspect_image(image.attrs["Id"])
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/image.py", line 245, in inspect_image
    return self._result(
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 265, in _result
    self._raise_for_status(response)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 261, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.ImageNotFound: 404 Client Error: Not Found ("no such image: sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716: No such image: sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716")

There is an almost identical issue in this forum. I can confirm that:

  1. I can log in successfully with docker login nvcr.io, and I am able to pull the image with docker pull nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3
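For anyone comparing image references: the launcher log above names the tag v3.22.05-tf1.15.4-py3, while the pull command names v3.21.11-tf1.15.4-py3. The following is a minimal Python sketch, not part of TAO, that extracts the image reference from the log line and compares it with the pulled one. The helper names image_from_log and split_ref are hypothetical, and whether the tag difference is related to the error is not established in this thread.

```python
import re

# Log line copied from the launcher output in this thread.
LOG_LINE = (
    "2022-11-12 19:07:34,322 [INFO] tlt.components.instance_handler.local_instance: "
    "Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3"
)
# Image reference copied from the docker pull command in this thread.
PULLED = "nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3"


def image_from_log(line):
    """Return the image reference the launcher reports, or None if absent."""
    match = re.search(r"Running command in container:\s*(\S+)", line)
    return match.group(1) if match else None


def split_ref(ref):
    """Split 'repo:tag' into (repo, tag); the tag is empty if none is given."""
    repo, sep, tag = ref.rpartition(":")
    # A colon that belongs to a registry port (host:5000/img) is not a tag separator.
    if not sep or "/" in tag:
        return ref, ""
    return repo, tag


wanted = image_from_log(LOG_LINE)
same_repo = split_ref(wanted)[0] == split_ref(PULLED)[0]
same_tag = split_ref(wanted)[1] == split_ref(PULLED)[1]
print("launcher wants:", wanted)
print("same repository:", same_repo, "| same tag:", same_tag)
```

If the tags differ, pulling the exact reference printed in the log is a cheap thing to try before reinstalling anything.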

Can you open a terminal instead of the notebook and check whether the command below runs successfully?
$ tao detectnet_v2 run /bin/bash

No, the same error persists. Here is the output:

tao detectnet_v2 run /bin/bash
2022-11-13 12:44:53,771 [INFO] root: Registry: ['nvcr.io']
2022-11-13 12:44:53,834 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3
Traceback (most recent call last):
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 259, in _raise_for_status
    response.raise_for_status()
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/images/sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716/json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ben/miniconda3/envs/ai/bin/tao", line 8, in <module>
    sys.exit(main())
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/entrypoint/entrypoint.py", line 113, in main
    local_instance.launch_command(
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/instance_handler/local_instance.py", line 319, in launch_command
    docker_handler.run_container(command)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/docker_handler/docker_handler.py", line 280, in run_container
    if not self._check_image_exists():
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/tlt/components/docker_handler/docker_handler.py", line 135, in _check_image_exists
    image_inspection_content = self._api_client.inspect_image(image.attrs["Id"])
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/image.py", line 245, in inspect_image
    return self._result(
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 265, in _result
    self._raise_for_status(response)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/api/client.py", line 261, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/ben/miniconda3/envs/ai/lib/python3.10/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.ImageNotFound: 404 Client Error: Not Found ("no such image: sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716: No such image: sha256:f7d3f352793a5ca5870bab2420ee27dfe076cb76365911e7f40832efe37cd716")

Are you running on a dGPU device or a Jetson device?

dGPU

Is it a local machine or a cloud machine?

local machine

I configured the machine with cgroup v2. I am not sure whether that is related to this issue. It lets me use --gpus all without sudo.
By the way, my machine is able to run nvidia-smi via docker.

I searched for the error in the TAO forum; the results are below.
https://forums.developer.nvidia.com/search?q=%22404%20Client%20Error%3A%20Not%20Found%20for%20url%22%22%20%23intelligent-video-analytics%3Atao-toolkit%20order%3Alatest

Maybe it can help you.

Also, can you provide more details about the steps you took before running "tao detectnet_v2 run /bin/bash"?

Thank you. I will reinstall my machine from a clean OS and see if the problem persists, but could you recommend another way to follow the tutorial apart from installing on the host machine?

Are there any docker images that include all the working dependencies?

Yes, you can use the docker command to start the container directly. This approach does not use tao-launcher.

$ docker run --runtime=nvidia -it --rm --entrypoint "" nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3 /bin/bash

Then you can run commands inside the container.
For example:
# detectnet_v2 dataset_convert -d your_spec.txt -o your_result

Please note that inside the container you do not need the “tao” prefix before “detectnet_v2”.
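To illustrate the note above, here is a minimal Python sketch that turns a launcher-style command line into the form used inside the container by dropping the leading "tao" (and a notebook "!"). The function name to_container_command is hypothetical, and the sketch does not translate host paths into container paths.

```python
import shlex


def to_container_command(launcher_command):
    """Drop a leading 'tao' (and a notebook '!') from a launcher command line."""
    args = shlex.split(launcher_command.lstrip("!"))
    if args and args[0] == "tao":
        args = args[1:]
    return shlex.join(args)


print(to_container_command("!tao detectnet_v2 dataset_convert -d your_spec.txt -o your_result"))
```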

On a dGPU machine, use --gpus all instead of --runtime=nvidia, and mount all the related volumes:

docker run --gpus all -it --rm --entrypoint "" -v "/home/ben/cv_samples_vv1.4.1":"/workspace/tao-experiments" -v /home/cv_samples_vv1.4.1/detectnet_v2/specs:/workspace/tao-experiments/detectnet_v2/specs nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3 /bin/bash

Then,

detectnet_v2 dataset_convert -d /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_tfrecords_kitti_trainval.txt -o /workspace/tao-experiments/data/tfrecords/kitti_trainval/kitti_trainval
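The paths passed to -d and -o must be container paths, not host paths. As a rough Python sketch of that mapping, assuming only the first -v mount from the command above (the helper names docker_run_args and to_container_path are hypothetical and not part of TAO or Docker):

```python
import shlex
from pathlib import PurePosixPath

IMAGE = "nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3"
# Host directory -> container directory, taken from the first -v flag above.
MOUNTS = {"/home/ben/cv_samples_vv1.4.1": "/workspace/tao-experiments"}


def docker_run_args(image, mounts, shell="/bin/bash"):
    """Build the argument list for an interactive GPU container."""
    args = ["docker", "run", "--gpus", "all", "-it", "--rm", "--entrypoint", ""]
    for host_dir, container_dir in mounts.items():
        args += ["-v", f"{host_dir}:{container_dir}"]
    return args + [image, shell]


def to_container_path(host_path, mounts):
    """Map a host path to the path it has inside the container."""
    path = PurePosixPath(host_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(host_dir)
        except ValueError:
            continue  # not under this mount, try the next one
        return str(PurePosixPath(container_dir) / relative)
    raise ValueError(f"{host_path} is not under any mounted directory")


SPEC = "/home/ben/cv_samples_vv1.4.1/detectnet_v2/specs/detectnet_v2_tfrecords_kitti_trainval.txt"
print(shlex.join(docker_run_args(IMAGE, MOUNTS)))
print(to_container_path(SPEC, MOUNTS))
```

A path that is not under any mounted directory raises ValueError here, which mirrors the fact that the container cannot see host files outside its mounts.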

Works perfectly :)
