TAO Toolkit: Error while fetching server API version

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
GeForce RTX 3090
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)

tao info --verbose
Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit: 			
		4.0.0-tf2.9.1: 				
			docker_registry: nvcr.io
			tasks: 
				1. classification_tf2
				2. efficientdet_tf2
		4.0.0-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. augment
				2. bpnet
				3. classification_tf1
				4. detectnet_v2
				5. dssd
				6. emotionnet
				7. efficientdet_tf1
				8. faster_rcnn
				9. fpenet
				10. gazenet
				11. gesturenet
				12. heartratenet
				13. lprnet
				14. mask_rcnn
				15. multitask_classification
				16. retinanet
				17. ssd
				18. unet
				19. yolo_v3
				20. yolo_v4
				21. yolo_v4_tiny
				22. converter
		4.0.1-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. mask_rcnn
				2. unet
		4.0.0-pyt: 				
			docker_registry: nvcr.io
			tasks: 
				1. action_recognition
				2. deformable_detr
				3. segformer
				4. re_identification
				5. pointpillars
				6. pose_classification
				7. n_gram
				8. speech_to_text
				9. speech_to_text_citrinet
				10. speech_to_text_conformer
				11. spectro_gen
				12. vocoder
				13. text_classification
				14. question_answering
				15. token_classification
				16. intent_slot_classification
				17. punctuation_and_capitalization
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023

I am trying to run the detectnet_v2/detectnet_v2.ipynb Jupyter notebook, and I get the following error:

!tao detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

docker.errors.DockerException: Error while fetching server API version: request() got an unexpected keyword argument 'chunked'

Running this command:

docker run -it --rm --gpus all \
-v /var/run/docker.sock:/var/run/docker.sock \
nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5

gives:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
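
The second message says the container runtime could not load libnvidia-ml.so.1, the NVML library that ships with the NVIDIA driver. A minimal Python sketch, assuming a Linux host, to check whether the host's dynamic loader can open that library at all (the helper name `can_load_library` is hypothetical and not part of any NVIDIA tool):

```python
import ctypes


def can_load_library(name: str) -> bool:
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False


if __name__ == "__main__":
    # libnvidia-ml.so.1 is the library named in the error message above.
    lib = "libnvidia-ml.so.1"
    status = "loadable" if can_load_library(lib) else "NOT loadable"
    print(f"{lib}: {status} on this host")
```

If the library is not loadable on the host itself, the driver installation is a more likely culprit than TAO or Docker.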

I have Docker up and running and I am logged in (docker login nvcr.io), but I am not sure what the error means. The TAO Toolkit instructions are also not clear.

All solutions say to:

add `-v /var/run/docker.sock:/var/run/docker.sock`
Reference: [Run TLT inside docker - #8 by Morganh](https://forums.developer.nvidia.com/t/run-tlt-inside-docker/181992/8)

Where exactly should I add it? It appears neither in the Jupyter notebook nor in the documentation.
Is there any way to run everything locally, without Docker?

Can someone please explain how to actually use Docker with TAO? What steps must be followed before running the Jupyter notebooks? I followed all the steps in the Getting Started section of the documentation, and it does not work.

Normally, users do not run into the above error.
Do you remember which steps you took before launching the notebook?

What is the output of nvidia-smi?
Do you meet the software requirements listed in TAO Toolkit Quick Start Guide - NVIDIA Docs?

I followed the steps in the TAO Toolkit Quick Start Guide - NVIDIA Docs.

Yes, I should meet them, because two months ago I was able to train with the lprnet Jupyter notebook. I also tried to reinstall everything, but I don’t see instructions for most of the listed requirements, such as nvidia-docker2, nvidia-container-runtime, nvidia-driver, and docker-API.

Please update the NVIDIA driver first.

Uninstall:
    sudo apt purge nvidia-driver-515
    sudo apt autoremove
    sudo apt autoclean
Install:
    sudo apt install nvidia-driver-525

Then check whether the command below works.
$ tao detectnet_v2 run /bin/bash

I updated the driver, but I still get the same error:

$  tao detectnet_v2 run /bin/bash
2023-05-26 12:47:40,605 [INFO] root: Registry: ['nvcr.io']
Traceback (most recent call last):
  File "/home/ff/.local/lib/python3.8/site-packages/docker/api/client.py", line 205, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/home/ff/.local/lib/python3.8/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/home/ff/.local/lib/python3.8/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/home/ff/.local/lib/python3.8/site-packages/docker/api/client.py", line 228, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/home/ff/.local/lib/python3.8/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/home/ff/.local/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ff/.local/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/ff/.local/lib/python3.8/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/ff/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/ff/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
TypeError: request() got an unexpected keyword argument 'chunked'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ff/.local/bin/tao", line 8, in <module>
    sys.exit(main())
  File "/home/ff/.local/lib/python3.8/site-packages/tlt/entrypoint/tao.py", line 114, in main
    instance.launch_command(
  File "/home/ff/.local/lib/python3.8/site-packages/tlt/components/instance_handler/local_instance.py", line 297, in launch_command
    docker_handler = self.handler_map[
  File "/home/ff/.local/lib/python3.8/site-packages/tlt/components/instance_handler/local_instance.py", line 147, in handler_map
    handler_map[handler_key] = DockerHandler(
  File "/home/ff/.local/lib/python3.8/site-packages/tlt/components/docker_handler/docker_handler.py", line 62, in __init__
    self._docker_client = docker.from_env()
  File "/home/ff/.local/lib/python3.8/site-packages/docker/client.py", line 84, in from_env
    return cls(
  File "/home/ff/.local/lib/python3.8/site-packages/docker/client.py", line 40, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/home/ff/.local/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/home/ff/.local/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: request() got an unexpected keyword argument 'chunked'
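
The first traceback fails inside urllib3 (`connectionpool.py`, at `conn.request(`) while being driven by the `docker` Python package, which suggests a version mismatch between Python packages in this environment rather than a problem with the Docker daemon. This error is commonly reported when urllib3 2.x is installed next to an older `docker` package; treat that as an assumption to verify, not a confirmed diagnosis. A small sketch that prints the installed versions and flags urllib3 2.x (the function names are illustrative):

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def urllib3_is_v2_or_newer(version: str) -> bool:
    """True when the major component of a urllib3 version string is >= 2."""
    major = version.split(".", 1)[0]
    return major.isdigit() and int(major) >= 2


if __name__ == "__main__":
    for name in ("docker", "requests", "urllib3"):
        print(f"{name}: {installed_version(name)}")
    u3 = installed_version("urllib3")
    if u3 and urllib3_is_v2_or_newer(u3):
        # Assumption: an older docker package may not accept urllib3 2.x calls.
        print("urllib3 2.x detected; check it against the docker package version")
```

Run it with the same interpreter that owns the failing `tao` command (here, the Python 3.8 under `/home/ff/.local`), since each environment has its own package versions.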

Can you share your ~/.tao_mounts.json?

$ cat ~/.tao_mounts.json
{
    "Mounts": [
        {
            "source": "/data/tlt-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/ff/cv_samples_v1.4.0/lprnet/specs",
            "destination": "/workspace/tao-experiments/lprnet/specs"
        }
    ],
    "DockerOptions": {
        "user": "1000:1000"
    }
}
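
Each `source` in `~/.tao_mounts.json` is a host directory that is mapped to `destination` inside the container. A short sketch, assuming the file layout shown above, that lists `source` directories missing on the host (`missing_mount_sources` is a hypothetical helper, not part of the TAO launcher):

```python
import json
import os
from typing import List


def missing_mount_sources(config: dict) -> List[str]:
    """Return the "source" paths in a mounts config that are not directories."""
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.isdir(mount["source"])
    ]


if __name__ == "__main__":
    path = os.path.expanduser("~/.tao_mounts.json")
    if os.path.isfile(path):
        with open(path) as fh:
            missing = missing_mount_sources(json.load(fh))
        print("missing sources:", missing or "none")
    else:
        print(f"{path} not found")
```

Note that the second mount in the file above still points at the lprnet specs, so it may need updating for the detectnet_v2 notebook.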

Where did you run the above command? Are you running it inside a Docker container?

I ran it from the terminal. I also tried from the Jupyter notebook, with the same result.

Is that terminal inside a Docker container? I am asking because I want to check whether you are running TAO inside a container.

no

In the terminal, what happens when you run the command below?
$ docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5 /bin/bash

$  docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5 /bin/bash
Unable to find image 'nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5' locally
4.0.1-tf1.15.5: Pulling from nvidia/tao/tao-toolkit
675920708c8b: Pulling fs layer 
6ee225430193: Pulling fs layer 
8b1a583c6b12: Pulling fs layer 
7302be10edc2: Pulling fs layer 
5bf9d046b7e1: Pulling fs layer 
b4f5e997e3c2: Pulling fs layer 
d0e445bfbc9d: Pulling fs layer 
a86d67d9ff4b: Pulling fs layer 
51d596db94d9: Waiting 
a8bdaac63bdf: Waiting 
299ca652228e: Waiting 
7302be10edc2: Waiting 
5bf9d046b7e1: Waiting 
4f4fb700ef54: Pulling fs layer 
b4f5e997e3c2: Waiting 
a86d67d9ff4b: Waiting 
300931cbac1c: Waiting 
4b5236affaa5: Waiting 
427005135432: Waiting 
db2e66266fb9: Waiting 
4f01fafea34b: Waiting 
dcd53fb1bc56: Waiting 
01d17b838f45: Waiting 
91ed80f2c9c0: Waiting 
8bbc4a6c9e0e: Waiting 
88d75e503364: Waiting 
79e1b0752753: Waiting 
39b9dc76c852: Waiting 
c93ae07d6479: Waiting 
90c95acba3a5: Pulling fs layer 
d305571fcb10: Waiting 
a39616f40a21: Waiting 
e78c2e7e20de: Waiting 
61dd8b90f868: Waiting 
7b78cb3b16e3: Pulling fs layer 
b9b686cc0f46: Waiting 
3312306bb639: Pull complete 
2ff959eef820: Pull complete 
d17505503e54: Pull complete 
d9a8f5b4c864: Pull complete 
dc35747d4921: Pull complete 
827af6bf44a0: Pull complete 
a550df7c26ae: Pull complete 
15e95920d5ab: Pull complete 
a7f784490a4c: Pull complete 
c70db7f8e25e: Pull complete 
0a3ac70b7cac: Pull complete 
82d0962c0a43: Pull complete 
7675ac28f195: Pull complete 
ddf509749602: Pull complete 
16176cc52c04: Pull complete 
9511b59547c8: Pull complete 
1e9fc48bf7ac: Pull complete 
aa84ef787043: Pull complete 
03ea23afdde5: Pull complete 
7932ba9caa30: Pull complete 
ee3ce74fb2f7: Pull complete 
066147f13cc6: Pull complete 
e81c67e17b9c: Pull complete 
Digest: sha256:eae96df1b040f1d1c9a8548c0e6954d2eb3ccf5671fca9a7ab68a26d6bc08b85
Status: Downloaded newer image for nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5

==============================
=== TAO Toolkit TensorFlow ===
==============================

NVIDIA Release 4.0.1-TensorFlow (build )
TAO Toolkit Version 4.0.1

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for TAO Toolkit.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

root@60cb7c26bd55:/workspace# tao detectnet_v2 run /bin/bash
bash: tao: command not found
root@60cb7c26bd55:/workspace#

Inside the container, you run the commands without the tao prefix. Run as below.
root@60cb7c26bd55:/workspace# detectnet_v2 run /bin/bash

So, you can work with docker run directly.

For the error you get when running the tao launcher, please double-check the steps in TAO Toolkit Quick Start Guide - NVIDIA Docs.

Okay, I have just figured it out.
I activated the conda environment “launcher” and, inside it, ran:

bash setup/quickstart_launcher.sh --upgrade

After that, the tao launcher started working.
However, when I run it in my base environment, I still get the error:
docker.errors.DockerException: Error while fetching server API version: request() got an unexpected keyword argument 'chunked'
I would like to uninstall tao from my base environment. How can I do that?
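
The earlier traceback shows the failing launcher under `/home/ff/.local`, that is, a per-user install outside the conda environment. A sketch to see which `tao` executable and launcher package a given interpreter resolves; the distribution name `nvidia-tao` is an assumption and should be confirmed with `pip list` before uninstalling anything:

```python
import shutil
import sys
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def launcher_report(dist_name: str = "nvidia-tao") -> dict:
    """Collect where this interpreter finds the tao launcher, if anywhere."""
    return {
        "python": sys.executable,                 # interpreter running this script
        "tao_executable": shutil.which("tao"),    # first `tao` on PATH, or None
        "package_version": installed_version(dist_name),  # assumed package name
    }


if __name__ == "__main__":
    for key, value in launcher_report().items():
        print(f"{key}: {value}")
```

Running it once in the base environment and once inside the conda environment shows which install each one picks up.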

How did you install tao previously?

I am facing the same issue as in this post, even though I installed TAO for the first time. I received this error:

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.

Hi, make sure docker is running:

sudo service docker status

If the error persists, uninstall Docker, reboot the machine, and install it again. Remember to always log in using:
docker login nvcr.io

One last thing: I was getting a similar error when I installed Docker Desktop. Docker Desktop says it comes with Docker Engine, but for some reason it did not install it, and the service wasn’t there. So I installed Docker Engine following: Install Docker Engine on Ubuntu | Docker Documentation
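
The "Cannot connect to the Docker daemon" message refers to the Unix socket `/var/run/docker.sock`. A minimal sketch, assuming a default Docker Engine install on Linux, that checks whether a socket at that path accepts connections (`docker_socket_reachable` is an illustrative name):

```python
import socket


def docker_socket_reachable(path: str = "/var/run/docker.sock") -> bool:
    """Return True if a Unix socket at `path` accepts a connection."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(2.0)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket, a refused connection, and denied permission.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print("docker socket reachable:", docker_socket_reachable())
```

A False result does not distinguish a stopped daemon from a permission problem, so `sudo service docker status` is still the more informative check.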

Hi, a reboot alone worked for me; reinstalling was not required.

Also, I solved the error “Error while fetching server API version: request() got an unexpected keyword argument 'chunked'” by using a conda environment with Python 3.6.13 instead of a Python virtualenv.
