google.protobuf.text_format.ParseError: 60:3 : ' }': Couldn't parse float

pddarrell · August 16, 2022, 6:28pm

Please provide the following information when requesting support.

• Hardware (RTX 3080)
• Network Type (Classification)
• TLT Version
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022
• Training spec file
• How to reproduce the issue ?

When I run “!tao classification train -e $LOCAL_SPECS_DIR -r $LOCAL_EXPERIMENT_DIR/classification -k $KEY”

The container stops with “google.protobuf.text_format.ParseError: 60:3 : ' }': Couldn't parse float: }”
Looking at my experiment spec file I changed the adam optimizer epsilon from ‘1e-7’ to ‘0.00000001’, but that did not change the ParseError.

Can you offer any suggestions?

Morganh · August 17, 2022, 2:47am

Can you change back to 1e-7 to check if it works?

pddarrell · August 17, 2022, 8:26am

It doesn’t work. (I originally changed it from 1e-7 to see if that was the problem)

Morganh · August 17, 2022, 1:39pm

Please add “}” at the end.
I find that you did not close the last eval_config with “}”.
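A quick way to confirm a missing closing brace is to count the braces in the spec text. This is only a minimal sketch: the function name `brace_balance` and the sample fields (`top_k`, `batch_size`) are hypothetical, and it assumes `#` only ever starts a comment in the spec file.

```python
def brace_balance(spec_text: str) -> int:
    """Return (number of '{') minus (number of '}'), ignoring '#' comments."""
    balance = 0
    for line in spec_text.splitlines():
        code = line.split("#", 1)[0]  # drop a trailing comment, if any
        balance += code.count("{") - code.count("}")
    return balance


# Hypothetical spec fragment whose last block is never closed.
sample = """
eval_config {
  top_k: 3
  batch_size: 256
"""

# A positive result means at least one '}' is missing,
# a negative result means there is a stray '}'.
print(brace_balance(sample))
```

A result of 0 does not prove the file is valid, it only rules out this one kind of mistake.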

pddarrell · August 17, 2022, 2:27pm

You’re right.
That is strange, because the “}” is there in my text editor.

I added an empty line to the end and now the “}” shows

but when I run the notebook I still get:
“google.protobuf.text_format.ParseError: 60:3 : ' }': Couldn't parse float: }”

Morganh · August 18, 2022, 2:21pm

I am afraid there are some unexpected hidden characters in your spec file.
I suggest you check it further.

Or you can copy the spec file in the notebook and then modify each parameter to the value you expect.
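One way to look for hidden characters is to list every character that is not plain printable ASCII, together with its position. The sketch below is an assumption about what "hidden" means here (non-breaking spaces, byte-order marks, curly quotes, carriage returns and similar); the function name `find_hidden_chars` and the sample text are hypothetical.

```python
import unicodedata


def find_hidden_chars(text: str):
    """Return (line, column, codepoint, name) for each suspicious character."""
    hits = []
    # Split on "\n" only, so that a stray "\r" stays in the line and is reported.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == "\t" or " " <= ch <= "~":
                continue  # tab or printable ASCII: nothing to report
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical spec fragment with a non-breaking space and a Windows line ending.
sample = "eval_config {\n  top_k: 3\u00a0\n}\r\n"
for hit in find_hidden_chars(sample):
    print(hit)

# To scan a real file, read it as text first, for example:
# text = open("path/to/spec_file", encoding="utf-8", errors="replace").read()
```

Each reported line and column can then be checked in the editor and retyped by hand.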

pddarrell · August 18, 2022, 3:15pm

Hi Morganh,

some of your language is slightly ambiguous:

There doesn’t appear to be a spec file in the notebook, so do you mean copy the spec file into the notebook? I am not very comfortable with that idea.

If I did do that, would I copy it from:

and exactly where in the notebook would I copy it to in order for it to work the way it is meant to?

Morganh · August 18, 2022, 3:18pm

Please refer to TAO Toolkit Quick Start Guide — TAO Toolkit 3.22.05 documentation
You can download notebook via

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/cv_samples/versions/v1.4.1/zip -O cv_samples_v1.4.1.zip
unzip -u cv_samples_v1.4.1.zip  -d ./cv_samples_v1.4.1 && rm -rf cv_samples_v1.4.1.zip && cd ./cv_samples_v1.4.1

There should be a sample spec file for classification network.

pddarrell · August 18, 2022, 4:58pm

Thank you for this.

I have completely rebuilt the training_spec document. The problem seems to have occurred when I made it a .json, at which point sections of the text turn red (in MS Visual Studio). I have therefore made it a .cfg, which makes the text revert to white. I have also gone back through the Jupyter notebook and changed all mentions of training_spec.json to training_spec.cfg.

I currently have a permissions issue which is preventing the cell from completing and so I cannot confirm that this issue is solved.

Morganh · August 19, 2022, 2:18am

You can open a terminal and try to save the spec file.
Then open the Jupyter notebook again.

pddarrell · August 19, 2022, 5:02am

Thank you Morganh.

I am unable to access my machine until September.

yingliu · September 6, 2022, 5:58am

Hello @pddarrell Do you have any updates on this topic?

pddarrell · September 6, 2022, 7:52am

Thank you for checking in Yingliu.

I am resuming the project today and I hope to get back to you before the end of the day (currently ~09:00 am here).

Many thanks

pddarrell · September 6, 2022, 10:31am

Hi again.

The permissions error I have persists even when I change “root:root” permissions to “peter:peter”

This occurs when I run “-r /home/peter/TAO_toolkit/results” in the “tao classification train” command. Since the above permission change I have also run “sudo chmod 771 results” to give the peter group read, write and execute permissions but this does not change the permissions error.
I have also tried restarting my machine (as a sanity check).
As a result I am currently unable to run the notebook. Please advise.
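For reference, mode 771 can be decoded with Python's standard `stat` module, and the same module shows who actually owns a directory. This is only a sketch: the helper name `describe` is hypothetical, and whether the process inside the container falls into the "others" class is an assumption that would need checking.

```python
import os
import stat

# 771 = owner rwx, group rwx, others execute only.
print(stat.filemode(stat.S_IFDIR | 0o771))  # drwxrwx--x


def describe(path: str):
    """Return (owner uid, group gid, permission string) for a path."""
    st = os.stat(path)
    return st.st_uid, st.st_gid, stat.filemode(st.st_mode)


# Example on the current directory; replace "." with the results directory.
print(describe("."))
```

With 771, a process that is neither the owner nor in the group can enter the directory but cannot list it or create files in it.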

Morganh · September 7, 2022, 2:34am

Can you share your ~/.tao_mounts.json ?

Also, please check if below can help you.
Please try to remove the following from the ~/.tao_mounts.json to check if it works.

    "DockerOptions": {
        "user": "1000:1000"

Reference: Permission Denied Error When training MASK RCNN - #12 by subhankar.halder
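Editing the file by hand makes it easy to leave a stray comma or brace behind. As a minimal sketch, the same change can be made with Python's `json` module, which always writes valid JSON. The helper name `drop_docker_user` and the mount paths in the sample are hypothetical; only the `DockerOptions` / `user` keys come from the snippet above.

```python
import copy
import json


def drop_docker_user(config: dict) -> dict:
    """Return a copy of the config without DockerOptions.user.

    DockerOptions itself is removed too if nothing else is left in it.
    """
    result = copy.deepcopy(config)
    options = result.get("DockerOptions", {})
    options.pop("user", None)
    if not options:
        result.pop("DockerOptions", None)
    return result


# Hypothetical contents of a mounts file.
sample = {
    "Mounts": [
        {"source": "/path/on/host", "destination": "/path/in/container"}
    ],
    "DockerOptions": {"user": "1000:1000"},
}

print(json.dumps(drop_docker_user(sample), indent=4))
```

To apply it to the real file, load it with `json.load`, pass the result through the helper, and write it back with `json.dump` after making a backup copy.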

pddarrell · September 7, 2022, 3:00pm

Also, please check if below can help you.
Please try to remove the following from the ~/.tao_mounts.json to check if it works.

    "DockerOptions": {
        "user": "1000:1000"

Here it is in the bash:

Here is the double quote error it throws:

Here is the same thing in MS Visual Studio used to check what “line 17” is

Line 17 is the final curly brace.

I have restored the following to ~/.tao_mounts at present:

    "DockerOptions": {
        "user": "1000:1000"

Morganh · September 7, 2022, 3:51pm

Change line 15
],
to
]

and retry.
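JSON does not allow a comma after the last item of an object, which is why the parser complains at the final curly brace. A small sketch with Python's `json` module shows the effect; the helper name `is_valid_json` and the sample paths are hypothetical, and the exact error message differs between Python versions.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, otherwise print the error and return False."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        print(f"line {err.lineno}, column {err.colno}: {err.msg}")
        return False
    return True


# Trailing comma after the closing "]" of the last entry.
broken = '{\n  "Mounts": [\n    {"source": "/a", "destination": "/b"}\n  ],\n}'
# Same content without the trailing comma.
fixed = '{\n  "Mounts": [\n    {"source": "/a", "destination": "/b"}\n  ]\n}'

print(is_valid_json(broken))
print(is_valid_json(fixed))
```

Running the same check on the real file after every manual edit catches this kind of mistake before the launcher does.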

pddarrell · September 7, 2022, 4:06pm

Apologies for the “,”

It throws a different Errno (2, instead of 13), but it is still failing to find ‘/home/peter/TAO_toolkit/data/train’

Morganh · September 7, 2022, 4:12pm

All the paths in the command line (for example, $ tao classification train xxx) should be paths inside the docker. Here xxx is a path inside the docker, i.e., as below.
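The mapping between a host path and the path seen inside the docker follows from the "Mounts" entries in ~/.tao_mounts.json. The sketch below illustrates the idea only: the helper name `to_container_path` and the destination "/workspace/tao-experiments" are assumptions, and the real values must be taken from your own mounts file.

```python
from pathlib import PurePosixPath


def to_container_path(host_path: str, mounts: list):
    """Translate a host path to its path inside the docker, or None if unmounted."""
    host = PurePosixPath(host_path)
    for mount in mounts:
        source = PurePosixPath(mount["source"])
        try:
            relative = host.relative_to(source)
        except ValueError:
            continue  # host_path is not under this mount's source
        return str(PurePosixPath(mount["destination"]) / relative)
    return None


# Hypothetical mount entry; the destination is an assumption.
mounts = [
    {"source": "/home/peter/TAO_toolkit", "destination": "/workspace/tao-experiments"}
]

print(to_container_path("/home/peter/TAO_toolkit/data/train", mounts))
print(to_container_path("/tmp/elsewhere", mounts))
```

A result of None means the host directory is not mounted at all, so no path passed on the command line could reach it from inside the docker.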

pddarrell · September 8, 2022, 10:03am

I have been using the format as described here:

and, as a result my command line shows this:

Are you saying that I am following the wrong instructions?
(I already completed the tao_voc exercise)
