Error when exporting with --force_ptq

• Xavier NX
• Yolo_v4/Darknet19
• Tao 3.0
yolo_v4_train_resnet18_kitti_seq.txt (2.5 KB)

When I export the YOLOv4 model with --force_ptq enabled, I get an error. Exporting without --force_ptq works. I would like to run my model on the DLA. Any idea what I am doing wrong?

I export with-

!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_darknet19_epoch_200.tlt \
-o $USER_EXPERIMENT_DIR/export/yolov4_darknet19_epoch_$EPOCH.etlt \
-e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_seq.txt \
-k $KEY \
--data_type int8 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--force_ptq \
--gen_ds_config

And receive the following error-

2022-02-24 11:07:44,389 [INFO] root: Registry: ['nvcr.io']
2022-02-24 11:07:44,463 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2022-02-24 11:07:44,471 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/ubuntu/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2022-02-24 11:07:51,042 [INFO] root: Building exporter object.
2022-02-24 11:07:54,907 [INFO] root: Exporting the model.
2022-02-24 11:07:54,907 [INFO] root: Using input nodes: ['Input']
2022-02-24 11:07:54,907 [INFO] root: Using output nodes: ['BatchedNMS']
2022-02-24 11:07:54,907 [INFO] iva.common.export.keras_exporter: Using input nodes: ['Input']
2022-02-24 11:07:54,907 [INFO] iva.common.export.keras_exporter: Using output nodes: ['BatchedNMS']
The ONNX operator number change on the optimization: 483 -> 232
2022-02-24 11:11:02,240 [INFO] keras2onnx: The ONNX operator number change on the optimization: 483 -> 232
2022-02-24 11:11:02,785 [INFO] iva.common.export.base_exporter: Generating a tensorfile with random tensor images. This may work well as a profiling tool, however, it may result in inaccurate results at inference. Please generate a tensorfile using the tlt-int8-tensorfile, or provide a custom directory of images for best performance.
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/export.py", line 12, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 265, in launch_export
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 247, in run_export
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 374, in export
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 186, in get_calibrator
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 286, in generate_tensor_file
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 329, in generate_random_tensorfile
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/tensorfile.py", line 54, in __init__
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 148, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 98, in h5py.h5f.create
ValueError: Invalid file name (invalid file name)
2022-02-24 11:11:04,218 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

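The export runs fine for me with --force_ptq.
Please double-check your command, and also add --cal_data_file and --cal_image_dir. Your traceback fails in generate_random_tensorfile with "Invalid file name", which suggests that no calibration tensorfile path was passed to the exporter.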
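
For a notebook workflow like the one in the question, the advice above could be sketched as a small helper that assembles the export command and refuses to build it when the calibration arguments are missing. This is only an illustration: the helper name build_export_command, the default of 10 batches, and the paths cal.tensorfile and $DATA_DIR/calibration_images are assumptions, and the guard is this sketch's own check, not something the TAO exporter is known to do. Only flags that already appear in this thread are used.

```python
def build_export_command(model, output, spec, key, cal_cache_file,
                         cal_data_file, cal_image_dir,
                         batch_size=1, batches=10):
    """Assemble a `tao yolo_v4 export` command for INT8 export with --force_ptq.

    Hypothetical helper. The batches default is an arbitrary placeholder;
    pick a value that matches the number of calibration images available.
    """
    # Guard added by this sketch, following the reply's advice to pass both flags.
    if not cal_data_file or not cal_image_dir:
        raise ValueError(
            "pass both cal_data_file and cal_image_dir for INT8 export with --force_ptq"
        )
    return [
        "tao", "yolo_v4", "export",
        "-m", model,
        "-o", output,
        "-e", spec,
        "-k", key,
        "--data_type", "int8",
        "--batch_size", str(batch_size),
        "--batches", str(batches),
        "--cal_cache_file", cal_cache_file,
        "--cal_data_file", cal_data_file,
        "--cal_image_dir", cal_image_dir,
        "--force_ptq",
        "--gen_ds_config",
    ]


# Paths reuse the variables from the original command; the tensorfile name
# and the calibration image directory are assumed for illustration.
cmd = build_export_command(
    model="$USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_darknet19_epoch_200.tlt",
    output="$USER_EXPERIMENT_DIR/export/yolov4_darknet19_epoch_$EPOCH.etlt",
    spec="$SPECS_DIR/yolo_v4_retrain_resnet18_kitti_seq.txt",
    key="$KEY",
    cal_cache_file="$USER_EXPERIMENT_DIR/export/cal.bin",
    cal_data_file="$USER_EXPERIMENT_DIR/export/cal.tensorfile",
    cal_image_dir="$DATA_DIR/calibration_images",
)
print(" ".join(cmd))
```

The printed line can be pasted after a `!` in a notebook cell so the shell expands the environment variables, in the same way as the original command.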

Below is my test experiment.

root@f52fb6bb8c67:/workspace/demo_3.0/CEPDOF# yolo_v4 export -k key -m result_non-qat/weights/yolov4_cspdarknet53_epoch_060.tlt -e config_train.yml --engine_file test/non_qat.engine --data_type int8 --batch_size 1 --batches 7202 --cal_cache_file test/non_qat.bin --cal_data_file test/cal.tensorfile --cal_image_dir High_activity -o test/output.etlt --force_ptq
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2022-02-24 15:58:32,151 [INFO] root: Building exporter object.
2022-02-24 15:58:45,603 [INFO] root: Exporting the model.
2022-02-24 15:58:45,604 [INFO] root: Using input nodes: ['Input']
2022-02-24 15:58:45,604 [INFO] root: Using output nodes: ['BatchedNMS']
2022-02-24 15:58:45,604 [INFO] iva.common.export.keras_exporter: Using input nodes: ['Input']
2022-02-24 15:58:45,604 [INFO] iva.common.export.keras_exporter: Using output nodes: ['BatchedNMS']
The ONNX operator number change on the optimization: 771 -> 363
2022-02-24 16:01:23,701 [INFO] keras2onnx: The ONNX operator number change on the optimization: 771 -> 363
7202it [13:53,  8.64it/s]
2022-02-24 16:15:20,897 [INFO] iva.common.export.keras_exporter: Calibration takes time especially if number of batches is large.
2022-02-24 16:15:20,897 [INFO] root: Calibration takes time especially if number of batches is large.
2022-02-24 16:45:33,738 [INFO] iva.common.export.base_calibrator: Saving calibration cache (size 11340) to /workspace/demo_3.0/CEPDOF/test/non_qat.bin
2022-02-24 16:47:07,925 [INFO] root: Export complete.
2022-02-24 16:47:07,925 [INFO] root: {
    "param_count": 49.44527,
    "size": 61.347960472106934
}

Thank you. That solved my problem.
