TAO Toolkit container timeout from tao CLI

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : x86_64 GPU machine
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) : fpenet
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) : Command ‘tlt’ not found
• Training spec file(If have, please share here): NA
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

I encounter a timeout error when running the v4.0.0 fpenet facial landmark estimation Jupyter notebook example.

Invoking Docker directly, following the instructions in "Working With the Containers - NVIDIA Docs", does not cause a problem.
However, running the tao launcher CLI always results in a timeout error. I tried setting DOCKER_CLIENT_TIMEOUT and COMPOSE_HTTP_TIMEOUT, but it didn't help.

tao fpenet dataset_convert -e $SPECS_DIR/$DATASET_CONFIG

2023-06-28 12:22:12,802 [INFO] root: Registry: ['nvcr.io']
2023-06-28 12:22:13,009 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Traceback (most recent call last):
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/site-packages/urllib3/connectionpool.py", line 466, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/site-packages/urllib3/connectionpool.py", line 461, in _make_request
    httplib_response = conn.getresponse()
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/http/client.py", line 1346, in getresponse
    response.begin()
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/site-packages/requests/adapters.py", line 450, in send
    timeout=timeout
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
...
    r = adapter.send(request, **kwargs)
  File "/home/wanghanpeng/anaconda3/envs/launcher/lib/python3.6/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
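
The last line of the traceback shows a 60-second read timeout on a Unix socket connection, which suggests the local Docker daemon did not answer a request in time. A minimal sketch for timing a request to the daemon outside the launcher is below. It uses only the Python standard library and the Docker Engine API `/_ping` endpoint; the socket path `/var/run/docker.sock` is an assumed default, and the helper names `UnixHTTPConnection` and `ping_docker` are hypothetical, not part of the TAO launcher.

```python
import http.client
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection over a Unix domain socket (standard library only)."""

    def __init__(self, socket_path, timeout):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with a Unix socket connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def ping_docker(socket_path="/var/run/docker.sock", timeout=60.0):
    """Send GET /_ping to the daemon; return (ok, seconds_elapsed, detail).

    The default socket path is an assumption; adjust it if DOCKER_HOST
    points somewhere else on your machine.
    """
    start = time.monotonic()
    conn = UnixHTTPConnection(socket_path, timeout)
    try:
        conn.request("GET", "/_ping")
        response = conn.getresponse()
        body = response.read().decode("utf-8", "replace")
        ok = response.status == 200
        detail = "HTTP {}: {}".format(response.status, body)
    except (OSError, http.client.HTTPException) as exc:
        # OSError covers socket timeouts, a missing socket file,
        # and permission errors on the socket.
        ok = False
        detail = "{}: {}".format(type(exc).__name__, exc)
    finally:
        conn.close()
    return ok, time.monotonic() - start, detail
```

Calling `ping_docker()` on the affected machine and on a working one gives a rough comparison: if the ping itself is slow or times out, the problem is more likely in the Docker daemon than in the launcher.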

I would appreciate any help.

How about
$ tao fpenet run /bin/bash

It reports the same error.

I switched to a new machine and everything works fine.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

So, you can compare the two machines. Also, please check the Python environment described in TAO Toolkit Quick Start Guide - NVIDIA Docs
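
One way to compare the Python environments on the two machines is to print the interpreter version and the versions of the packages that appear in the traceback, then diff the output. The sketch below is only an illustration: the package list is an assumption (`nvidia-tao` is assumed to be the launcher's pip package name), and `installed_version` and `environment_report` are hypothetical helper names.

```python
import platform
import sys

# Packages seen in the traceback, plus the assumed launcher package name.
PACKAGES = ("docker", "requests", "urllib3", "six", "nvidia-tao")


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for older interpreters such as Python 3.6
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Collect the interpreter details and package versions into a dict."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("{}: {}".format(key, value))
```

Run it inside the launcher's environment on each machine and compare the two outputs line by line.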
