Please provide the following information when requesting support.
• Hardware: Ubuntu 18.04 (x86), 1070 Ti
• Network Type: Classification, resnet18
• TLT Version: toolkit_version 3.21.08
Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 1.0
toolkit_version: 3.21.08
published_date: 08/17/2021
tao classification train -e ./specs/classification_spec.cfg -r ./ -k xxxx
Running this command in the terminal fails with:
FileNotFoundError: [Errno 2] No such file or directory: './specs/classification_spec.cfg'
Running it from the Jupyter notebook fails with:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ncp/tao/cv_samples_v1.2.0/classification/faces/train'
2021-11-23 22:22:06,234 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
I am sure that my configuration file path and training set path are correct, but it keeps reporting that files cannot be found.
train_dataset_path: "/home/ncp/tao/cv_samples_v1.2.0/classification/faces/train"
val_dataset_path: "/home/ncp/tao/cv_samples_v1.2.0/classification/faces/val"
pretrained_model_path: "/home/ncp/tao/cv_samples_v1.2.0/classification/resnet_18.hdf5"
Morganh
November 23, 2021, 4:33pm
Please note that every path in your command is a path inside the docker container.
So you need to set the paths according to your ~/.tao_mounts.json.
tao classification train -e <the path you set in destination> -r ./ -k xxxx
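To make the mapping concrete, here is a minimal Python sketch of how a host path could be translated to a path inside the container, given the source/destination pairs from ~/.tao_mounts.json. The function name `to_container_path` and the example mount are hypothetical and chosen only for illustration; the launcher's actual resolution logic may differ.

```python
from pathlib import PurePosixPath
from typing import Optional


def to_container_path(host_path: str, mounts: list) -> Optional[str]:
    """Return the in-container equivalent of host_path, or None if no mount covers it."""
    host = PurePosixPath(host_path)
    for mount in mounts:
        source = PurePosixPath(mount["source"])
        try:
            # relative_to raises ValueError when host is not under source.
            relative = host.relative_to(source)
        except ValueError:
            continue
        return str(PurePosixPath(mount["destination"]) / relative)
    return None


# Hypothetical mount entry, not taken from the thread.
example_mounts = [
    {"source": "/home/user/experiments", "destination": "/workspace/experiments"}
]

print(to_container_path("/home/user/experiments/specs/spec.cfg", example_mounts))
print(to_container_path("/tmp/elsewhere/spec.cfg", example_mounts))
```

In this sketch a path that falls under no mount source, including a relative path such as ./specs/classification_spec.cfg, yields None, which is the situation where the command cannot see the file.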
The path settings in my ~/.tao_mounts.json are the same; they are all absolute paths, and I could not find any problem with them.
I tried this, but it didn't work; the same error was displayed.
Morganh
November 24, 2021, 8:30am
I suggest you modify your tao_mounts.json.
Keep only the 2nd "source" and "destination" pair:
"source": "/home/ncp/tao/cv_samples_v1.2.0/classfication",
"destination": "/home/ncp/tao/cv_samples_v1.2.0/classfication"
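A minimal sketch of a mounts file reduced to a single entry, as suggested above. It assumes the top-level `Mounts` key used in ~/.tao_mounts.json; the helper `single_mount_config` and the directory are hypothetical. The sketch only prints the JSON, so nothing on disk is overwritten.

```python
import json


def single_mount_config(directory: str) -> dict:
    """Build a mounts config with one entry mapping directory to the same path in the container."""
    return {"Mounts": [{"source": directory, "destination": directory}]}


# Hypothetical directory; substitute the folder that holds your specs and data.
config = single_mount_config("/home/user/tao/classification")
print(json.dumps(config, indent=4))
```

Using the same string for source and destination means the absolute paths in the spec file are identical on the host and inside the container, provided the spelling of the mounted directory matches the spec paths exactly.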
You mean I should keep just one setting, right? Then I will keep only the second one.
Should I run it in the terminal or in Jupyter?
Morganh
November 24, 2021, 8:33am
Yes, I am afraid the 1st setting conflicts with the 2nd, so keep only the 2nd.
Try the terminal first.
OK, thanks. I will test it tonight.
But I still have a question about this setting. What is the path after the -r parameter for? Is it the default weight file or the training data?
Morganh
November 24, 2021, 8:39am
It is the results folder. It usually stores the .tlt file generated during training.
I tested it, and this time it reports that the training data cannot be found:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ncp/tao/cv_samples_v1.2.0/classification/faces/train'
Morganh
November 24, 2021, 2:03pm
Please check all the paths in your training spec file.
For the error above, you can check whether the path exists with the command below.
$ tao classification run ls /home/ncp/tao/cv_samples_v1.2.0/classification/faces/train
I suggest you log in to the docker container to debug any issue.
$ tao classification run /bin/bash
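Checking every path in the spec one by one can be automated. The following is a minimal sketch, meant to be run inside the container, that pulls quoted values out of fields ending in `_path` and reports the ones that do not exist. The helper name `missing_paths` and the assumption that all relevant fields end in `_path` are mine, based only on the three fields quoted earlier in this thread.

```python
import os
import re

# Matches lines such as: train_dataset_path: "/some/dir"
PATH_FIELD = re.compile(r'^\s*(\w+_path)\s*:\s*"([^"]+)"', re.MULTILINE)


def missing_paths(spec_text: str) -> list:
    """Return (field, path) pairs from spec_text whose path does not exist here."""
    return [
        (field, path)
        for field, path in PATH_FIELD.findall(spec_text)
        if not os.path.exists(path)
    ]


# Illustrative spec fragment with a path that is unlikely to exist anywhere.
example_spec = 'val_dataset_path: "/no/such/dir/xyz123"\n'
for field, path in missing_paths(example_spec):
    print(f"missing: {field} -> {path}")
```

To use it on a real spec, read the file contents with `open(...).read()` and pass the text to `missing_paths`. Any path it reports is one the container cannot see, which points back to the mounts file.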
Morganh
November 24, 2021, 2:13pm
Yes, you are now logged in to the docker container.
You can check whether a file is available,
or you can run training, etc.
$ classification train xxx
Running
classification train -e /home/ncp/tao/cv_samples_v1.2.0/classification/specs/classification_spec.cfg -r ./ -k nvidia_tlt
fails with:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ncp/tao/cv_samples_v1.2.0/classification/faces/val'
Is it necessary to set the validation and test set paths in tao_mounts.json as well?
Morganh
November 24, 2021, 2:20pm
No. See the TAO Toolkit Launcher page in the TAO Toolkit 3.22.05 documentation. tao_mounts.json is used to map local folders into the docker container.
I have no name!@16d3728c815e :/workspace$ train -e
bash: train: command not found
Morganh
November 24, 2021, 2:31pm
Please use
$ classification train xxx xxx