Please provide the following information when requesting support.
• Hardware (RTX 3080)
• Network Type (Classification)
• TLT Version: format_version: 2.0, toolkit_version: 3.22.05, published_date: 05/25/2022
• Training spec file
• How to reproduce the issue?
When I run "!tao classification train -e $LOCAL_SPECS_DIR -r $LOCAL_EXPERIMENT_DIR/classification -k $KEY"
the container stops with "google.protobuf.text_format.ParseError: 60:3 : ' }': Couldn't parse float: }"
In my experiment spec file I changed the Adam optimizer epsilon from '1e-7' to '0.00000001', but the ParseError remained.
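For reference, the "60:3" in the ParseError is the line and column (both counted from 1) where the protobuf text parser gave up, and "Couldn't parse float: }" suggests it expected a number but found a closing brace, for example a field left without a value. A minimal Python sketch for printing the lines around that position follows. The helper name show_error_context and the sample spec fragment are hypothetical, for illustration only, and are not taken from the actual spec file.

```python
def show_error_context(spec_text: str, line: int, column: int, radius: int = 2) -> str:
    """Return the lines around a 1-based (line, column) position, with a caret under it."""
    lines = spec_text.splitlines()
    first = max(1, line - radius)
    last = min(len(lines), line + radius)
    out = []
    for number in range(first, last + 1):
        # The prefix "NNNN | " is always 7 characters wide.
        out.append(f"{number:4d} | {lines[number - 1]}")
        if number == line:
            out.append(" " * (7 + column - 1) + "^")
    return "\n".join(out)


# Hypothetical fragment (not the real spec): "epsilon:" has no value,
# so a parser would hit "}" where it expects a float.
sample = """optimizer {
  adam {
    lr: 0.01
    epsilon:
  }
}"""

print(show_error_context(sample, line=5, column=3))
```

To use it on a real file, read the spec with open(path).read() and pass the line and column from the error message, here 60 and 3.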
I have completely rebuilt the training_spec document. The problem seems to have started when I saved it as a .json file, at which point sections of the text turned red (in MS Visual Studio). I have therefore saved it as a .cfg file, which makes the text revert to white. I have also gone back through the Jupyter notebook and changed all mentions of training_spec.json to training_spec.cfg.
I currently have a permissions issue that prevents the cell from completing, so I cannot confirm that the ParseError is solved.
The permissions error occurs when I pass "-r /home/peter/TAO_toolkit/results" to the "tao classification train" command. Since the above permission change I have also run "sudo chmod 771 results" to give the peter group read, write and execute permissions, but this does not change the permissions error.
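For reference, mode 771 gives the owner and the group full access but leaves everyone else with execute only, so whether a write succeeds depends on which user and group the writing process runs as. A minimal Python sketch of that check follows. The function can_write is hypothetical and simplified: it ignores ACLs and other mechanisms and treats uid 0 as always allowed. The uid and gid values are made-up examples, not values from this machine.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: set) -> bool:
    """Simplified Unix write check using only the classic owner/group/other bits."""
    if uid == 0:
        # Simplification: root normally bypasses permission bits.
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# How "chmod 771" looks in "ls -l" style for a directory.
print(stat.filemode(stat.S_IFDIR | 0o771))

# Example ids only: owner 1000:1000, and a different user 2000:2000.
print(can_write(0o771, 1000, 1000, uid=1000, gids={1000}))
print(can_write(0o771, 1000, 1000, uid=2000, gids={2000}))
```

On the real directory, os.stat("results") returns st_mode, st_uid and st_gid, which can be compared with the user the training process actually runs as.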
I have also tried restarting my machine (as a sanity check).
As a result I am currently unable to run the notebook. Please advise.
All the paths in the command line (for example, $ tao classification train xxx) should be paths inside the Docker container. That is, xxx is a path inside the container, as below.