On the host I am able to run the training successfully, but when I try to run the training in a Docker container I get JSON errors. I am using nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5 as the base image, but I understand the problem might be coming from mounts.json. This is how I set up the mounts in Docker:
# Define the file path for mounts file.
mounts_file="/root/.tao_mounts.json"
# Define the dictionary with the mapped drives.
drive_map='{"Mounts": [
    {
        "source": "'"${LOCAL_PROJECT_DIR}"'",
        "destination": "/use/src/tao-experiments"
    },
    {
        "source": "'"${LOCAL_SPECS_DIR}"'",
        "destination": "'"${SPECS_DIR}"'"
    }
]}'
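For comparison, here is a minimal sketch of producing the same mapping with Python's json module instead of shell quoting, so the serializer handles the quoting. The helper name write_mounts_file, the fallback paths, and the temporary output location are assumptions made for illustration; only the "Mounts" / "source" / "destination" layout and the variable names come from the snippet above.

```python
import json
import os
import tempfile


def write_mounts_file(path, mounts):
    """Write (source, destination) pairs in the {"Mounts": [...]} layout."""
    drive_map = {
        "Mounts": [{"source": src, "destination": dst} for src, dst in mounts]
    }
    with open(path, "w") as mfile:
        json.dump(drive_map, mfile, indent=4)
    return drive_map


# Hypothetical fallback values, used only when the variables are not set.
project_dir = os.environ.get("LOCAL_PROJECT_DIR", "/tmp/tao-project")
specs_dir = os.environ.get("LOCAL_SPECS_DIR", "/tmp/tao-project/specs")
container_specs_dir = os.environ.get("SPECS_DIR", "/workspace/specs")

# Written to a temporary location here; the post uses /root/.tao_mounts.json.
mounts_file = os.path.join(tempfile.gettempdir(), "tao_mounts_example.json")
write_mounts_file(
    mounts_file,
    [
        (project_dir, "/use/src/tao-experiments"),
        (specs_dir, container_specs_dir),
    ],
)

# Read the file back the same way the traceback shows the launcher doing it.
with open(mounts_file) as mfile:
    loaded = json.load(mfile)
print(json.dumps(loaded, indent=4))
```

Because json.dump does the serialization, the resulting file parses with json.load regardless of what characters the paths contain.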
And this is the error:
tfrecords generation
2023-05-04 03:20:05,403 [INFO] root: Registry: ['nvcr.io']
2023-05-04 03:20:05,445 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Traceback (most recent call last):
  File "/usr/local/bin/tao", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tlt/entrypoint/tao.py", line 116, in main
    args[1:]
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/instance_handler/local_instance.py", line 319, in launch_command
    docker_handler.run_container(command)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/docker_handler/docker_handler.py", line 284, in run_container
    mount_data, env_vars, docker_options = self._get_mount_env_data()
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/docker_handler/docker_handler.py", line 92, in _get_mount_env_data
    data = self._load_mounts_file(self._docker_mount_file)
  File "/usr/local/lib/python3.6/dist-packages/tlt/components/docker_handler/docker_handler.py", line 77, in _load_mounts_file
    data = json.load(mfile)
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 7 column 19 (char 146)
2023-05-04 03:20:05,834 [INFO] root: Registry: ['nvcr.io']
2023-05-04 03:20:05,875 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Traceback (most recent call last):
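Since the JSONDecodeError reports a line and column, one way to see what the parser is objecting to is to load the text yourself and print the offending line. This is only a sketch: check_json is a hypothetical helper, and the broken sample string is made up for illustration, not the actual contents of my mounts file.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # lineno is 1-based; guard against an error reported past the last line.
        bad_line = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        return "{} at line {} column {}: {!r}".format(
            err.msg, err.lineno, err.colno, bad_line
        )
    return None


# Made-up sample with an unquoted value, similar in kind to a quoting mistake.
broken = '{"Mounts": [\n  {"source": /data, "destination": "/workspace"}\n]}'
valid = '{"Mounts": []}'

print(check_json(broken))
print(check_json(valid))
```

To check the real file, the same helper can be called on its contents, for example check_json(open("/root/.tao_mounts.json").read()).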
Could you please tell me how to use mounts properly when training in a Docker container?
Thanks in advance.