Is there something special about bpnet? A question about "tlt bpnet dataset_convert"

I have run several classification examples without errors. When I tested the bpnet example, I got the following error:


# Generate TFRecords for training dataset
!tlt bpnet dataset_convert \
    -m 'train' \
    -o $DATA_DIR/train \
    --generate_masks \
    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json


Traceback (most recent call last):
  File "/usr/local/bin/tlt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tlt/entrypoint/entrypoint.py", line 114, in main
    args[1:]
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 258, in launch_command
    docker_logged_in(required_registry=self.task_map[task].docker_registry)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 130, in docker_logged_in
    data = load_config_file(docker_config)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 66, in load_config_file
    "No file found at: {}. Did you run docker login?".format(config_path)
AssertionError: Config path must be a valid unix path. No file found at: /root/.docker/config.json. Did you run docker login?


It seems that I do not have the file "/root/.docker/config.json". Should it exist inside the Docker container or on the host machine? I created an empty one inside the container with "touch", but then I got a different error:


# Generate TFRecords for training dataset
!tlt bpnet dataset_convert \
    -m 'train' \
    -o $DATA_DIR/train \
    --generate_masks \
    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json


Traceback (most recent call last):
  File "/usr/local/bin/tlt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tlt/entrypoint/entrypoint.py", line 114, in main
    args[1:]
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 258, in launch_command
    docker_logged_in(required_registry=self.task_map[task].docker_registry)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 130, in docker_logged_in
    data = load_config_file(docker_config)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 71, in load_config_file
    data = json.load(cfile)
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)


How can I fix it?

See Launch tlt detectnet_v2 - #2 by Morganh
For "AssertionError: Config path must be a valid unix path. No file found at: /root/.docker/config.json", please try the following:
You need to run docker login nvcr.io on your host PC.
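
To tell the two failures above apart (file missing versus file present but empty), a rough check like the one below may help. It is only a sketch: check_docker_login is a hypothetical helper and not part of tlt, and it assumes that "logged in" means the config file exists, parses as JSON, and lists the registry under the "auths" key that docker login writes. The launcher's real check may differ.

```python
import json
import os


def check_docker_login(config_path, registry="nvcr.io"):
    """Return (ok, reason) for a Docker client config file.

    Assumption: a successful `docker login <registry>` leaves an entry
    for that registry under the top-level "auths" key.
    """
    if not os.path.isfile(config_path):
        return False, "no file found at {}".format(config_path)
    try:
        with open(config_path) as cfile:
            data = json.load(cfile)
    except json.JSONDecodeError as err:
        # An empty file created with `touch` ends up here.
        return False, "not valid JSON: {}".format(err)
    auths = data.get("auths") if isinstance(data, dict) else None
    if not auths or registry not in auths:
        return False, "no login entry for {}".format(registry)
    return True, "login entry found for {}".format(registry)


if __name__ == "__main__":
    # Check the config of whichever user runs this script.
    path = os.path.expanduser("~/.docker/config.json")
    print(check_docker_login(path))
```

Run it as the same user, and in the same environment, that runs the tlt command. The traceback looks for /root/.docker/config.json, so the check there is being made as root.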

After running docker login nvcr.io on my host PC, I still get the following error:


Traceback (most recent call last):
  File "/usr/local/bin/tlt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tlt/entrypoint/entrypoint.py", line 114, in main
    args[1:]
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 258, in launch_command
    docker_logged_in(required_registry=self.task_map[task].docker_registry)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 130, in docker_logged_in
    data = load_config_file(docker_config)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/utils.py", line 66, in load_config_file
    "No file found at: {}. Did you run docker login?".format(config_path)
AssertionError: Config path must be a valid unix path. No file found at: /root/.docker/config.json. Did you run docker login?


After running docker login nvcr.io on my host PC, I also ran docker login nvcr.io inside the Docker container, and then got the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.6/http/client.py", line 1281, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1327, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1276, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1042, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 980, in send
    self.connect()
  File "/usr/local/lib/python3.6/dist-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 727, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py", line 403, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.6/http/client.py", line 1281, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1327, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1276, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1042, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 980, in send
    self.connect()
  File "/usr/local/lib/python3.6/dist-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 205, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/lib/python3.6/dist-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/lib/python3.6/dist-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 228, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 543, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/tlt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tlt/entrypoint/entrypoint.py", line 114, in main
    args[1:]
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 259, in launch_command
    docker_handler = self.handler_map[
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 114, in handler_map
    docker_mount_file=os.getenv("LAUNCHER_MOUNTS", DOCKER_MOUNT_FILE)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/docker_handler/docker_handler.py", line 47, in __init__
    self._docker_client = docker.from_env()
  File "/usr/local/lib/python3.6/dist-packages/docker/client.py", line 85, in from_env
    timeout=timeout, version=version, **kwargs_from_env(**kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/client.py", line 40, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 213, in _retrieve_server_version
    'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))


What does this error mean, and how can I fix it?

How about running $ sudo docker login nvcr.io on the host PC?

If not able to run with sudo, please see TLT Quick Start Guide — Transfer Learning Toolkit 3.0 documentation
Once you have installed docker-ce, follow the post-installation steps to ensure that docker can be run without sudo.

Setting up rootless Docker seems too difficult. Is there an easier way to run the bpnet sample? I have been struggling with Docker for a long time and am close to giving up, but I would still like to use this bpnet sample. Is there any other way?

First, can you run the command below successfully with root access?
$ docker run hello-world

Also, are you launching the tlt command from inside another Docker container?
If so, please add
-v /var/run/docker.sock:/var/run/docker.sock
to the docker run command that starts that container.

See Tlt augment not working
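
The last traceback above fails in sock.connect(self.unix_socket) with FileNotFoundError, which is what you would expect when the Docker daemon socket is not visible from where tlt runs. A small sketch for checking this from inside the container follows. docker_socket_status is a hypothetical helper, not part of tlt or the docker package, and it assumes the default Linux socket path /var/run/docker.sock.

```python
import os
import stat

# Assumed default location of the Docker daemon socket on Linux.
DOCKER_SOCKET = "/var/run/docker.sock"


def docker_socket_status(path=DOCKER_SOCKET):
    """Return a short status string for the Docker daemon socket at `path`."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Nothing at this path, e.g. the socket was not mounted.
        return "missing"
    if not stat.S_ISSOCK(mode):
        # Something exists at the path, but it is not a Unix socket.
        return "not a socket"
    if not os.access(path, os.R_OK | os.W_OK):
        # The socket exists, but the current user may not use it.
        return "no permission"
    return "ok"


if __name__ == "__main__":
    print(docker_socket_status())
```

A result of "missing" inside the container would be consistent with the socket not being mounted with -v. A result of "no permission" on the host points back to the post-installation steps for running docker without sudo.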