QAT model evaluation error

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Classification_TF2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) 4.0.1-tf2.9.1
• Training spec file(If have, please share here) /getting_started_v4.0.1/notebooks/tao_launcher_starter_kit/classification_tf2/tao_voc/specs/spec_retrain_qat.yaml
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
From virtual environment, open the jupyter notebook /getting_started_v4.0.1/notebooks/tao_launcher_starter_kit/classification_tf2/tao_voc/classification.ipynb and run the cells.
I train the model and prune it. Retraining with QAT then completes correctly, but evaluation always chooses the fully dense model by default. If I manually override the path so that it picks the QAT model, it gives the following error:


Complete error log:
Total params: 895,156
Trainable params: 856,816
Non-trainable params: 38,340


‘Conv2DQuantizeWrapper’ object has no attribute ‘data_format’
Error executing job with overrides:
Traceback (most recent call last):
File “/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py”, line 211, in run_and_report
return func()
File “/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py”, line 368, in
lambda: hydra.run(
File “/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py”, line 110, in run
_ = ret.return_value
File “/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py”, line 233, in return_value
raise self._return_value
File “/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py”, line 160, in run_job
ret.return_value = task_function(task_cfg)
File “”, line 169, in main
File “”, line 76, in _func
File “”, line 49, in _func
File “”, line 56, in run_evaluate
File “”, line 120, in get_input_shape
AttributeError: ‘Conv2DQuantizeWrapper’ object has no attribute ‘data_format’

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “</usr/local/lib/python3.8/dist-packages/nvidia_tao_tf2/cv/classification/scripts/evaluate.py>”, line 3, in
File “”, line 173, in
File “”, line 87, in wrapper
File “/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py”, line 367, in _run_hydra
run_and_report(
File “/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py”, line 251, in run_and_report
assert mdl is not None
AssertionError
Sending telemetry data.
Telemetry data couldn’t be sent, but the command ran successfully.
[Error]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL

Will check further internally.

This is an issue in TAO 4.0. The upcoming TAO 5.0 release will fix it.
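
For readers wondering what the traceback means: the error is raised in get_input_shape when it reads data_format from a layer that QAT has wrapped in Conv2DQuantizeWrapper. The sketch below is not TAO code. It uses hypothetical stand-in classes (Conv2D, QuantizeWrapper) and a hypothetical helper (get_data_format), and it assumes the wrapper keeps the original layer in a `.layer` attribute, as Keras wrapper layers conventionally do. It only illustrates why the attribute lookup fails on the wrapper and how unwrapping would avoid it.

```python
class Conv2D:
    """Hypothetical stand-in for a convolution layer (not the Keras class)."""

    def __init__(self, data_format="channels_last"):
        self.data_format = data_format


class QuantizeWrapper:
    """Hypothetical stand-in for a QAT wrapper layer.

    Assumption: the wrapped layer is stored in `.layer`, and attributes
    of the wrapped layer are NOT forwarded by the wrapper itself.
    """

    def __init__(self, layer):
        self.layer = layer


def get_data_format(layer, default="channels_last"):
    """Return data_format, unwrapping wrapper layers if needed."""
    # Walk down through any number of nested wrappers.
    while not hasattr(layer, "data_format") and hasattr(layer, "layer"):
        layer = layer.layer
    # Fall back to a default if no layer in the chain defines it.
    return getattr(layer, "data_format", default)


wrapped = QuantizeWrapper(Conv2D("channels_first"))

# Reading the attribute directly fails, as in the log above.
try:
    wrapped.data_format
except AttributeError as err:
    print("direct access failed:", err)

# Unwrapping first succeeds.
print("unwrapped:", get_data_format(wrapped))
```

Whether the 5.0 fix works this way is not stated in this thread; the sketch only shows the failure pattern.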

Thank you
Do you know when TAO 5.0 will be released?

It will be available very soon.

The 5.0 docker image is now available on NGC (TAO Toolkit | NVIDIA NGC).
For Classification_TF2 network, please pull nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf2.11.0
