Question on exporting YOLOv3 QAT model

Hello, I’m currently working on a YOLOv3 model with a ResNet-18 backbone. The model was pruned and then retrained with Quantization-Aware Training (QAT). Now I’m trying to export the model and generate the calibration cache file using the tlt-export command. Below is the exact command I used:

tlt-export yolo -k $KEY \
    -m $PATH_TO_RETRAINED_MODEL/yolo_res18_qat.tlt \
    -e $PATH_TO_SPEC/yolo_retrain_resnet18_kitti.txt \
    --data_type int8 --batch_size 16 \
    --cal_cache_file $MODEL_OUTPATH/yolo_res18_qat/cal.bin \
    -o $MODEL_OUTPATH/yolo_res18_qat/yolo_res18_qat.etlt

However, I got the error message attached below:

NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
Warning: No conversion function registered for layer: BatchedNMS_TRT yet.
Converting BatchedNMS as custom op: BatchedNMS_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample1/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample0/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_2 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_1 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_0 as custom op: BatchTilePlugin_TRT
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['BatchedNMS'] as outputs
2020-11-18 09:50:50,183 [INFO] iva.common.export.base_exporter: Generating a tensorfile with random tensor images. This may work well as a profiling tool, however, it may result in inaccurate results at inference. Please generate a tensorfile using the tlt-int8-tensorfile, or provide a custom directory of images for best performance.
Traceback (most recent call last):
File "/usr/local/bin/tlt-export", line 8, in <module>
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 185, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 263, in run_export
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 478, in export
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 206, in get_calibrator
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 295, in generate_tensor_file
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 337, in generate_random_tensorfile
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/export/data.py", line 54, in __init__
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 312, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 148, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 98, in h5py.h5f.create
ValueError: Invalid file name (invalid file name)

Please refer to the commands inside the Jupyter notebook or the TLT user guide. For example, one of the following is needed:

  • --cal_data_file : tensorfile generated by tlt-int8-tensorfile for calibrating the engine. This can also be an output file if used with --cal_image_dir .
  • --cal_image_dir : directory of images to use for calibration.

One of these is needed to generate cal.bin; this is not related to QAT. In your traceback, the exporter fell back to generating a random tensorfile and h5py raised "Invalid file name" since no --cal_data_file path was supplied. A corrected command is sketched below.
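For example, a corrected command might look like this (a sketch only: $CAL_IMAGES is a hypothetical placeholder for your calibration image directory, and the tensorfile path is illustrative; the other flags are carried over from your original command):

# With --cal_image_dir set, --cal_data_file is treated as an output
# tensorfile path, per the argument descriptions above.
tlt-export yolo -k $KEY \
    -m $PATH_TO_RETRAINED_MODEL/yolo_res18_qat.tlt \
    -e $PATH_TO_SPEC/yolo_retrain_resnet18_kitti.txt \
    --data_type int8 --batch_size 16 \
    --cal_image_dir $CAL_IMAGES \
    --cal_data_file $MODEL_OUTPATH/yolo_res18_qat/calibration.tensor \
    --cal_cache_file $MODEL_OUTPATH/yolo_res18_qat/cal.bin \
    -o $MODEL_OUTPATH/yolo_res18_qat/yolo_res18_qat.etlt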

I see those two arguments; however, my question is: are they still needed when my input model was trained with QAT? From what I understand, the dynamic ranges for int8 are already computed during quantization-aware training, so there should be no need to calibrate again when exporting the model.

Here is a note quoted from the tlt-export section of the official documentation:

When exporting a model trained with QAT enabled, the tensor scale factors to calibrate the activations are peeled out of the model and serialized to a TensorRT readable cache file defined by the cal_cache_file argument.
@Morganh
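For reference, the cache written to cal_cache_file is a plain-text file in TensorRT’s calibration-cache format: a header line naming the calibration algorithm, followed by one "tensor_name: hex-encoded scale" entry per line. A quick sanity check of the exported scales (a sketch; the header tag and tensor names in the comments are illustrative, not taken from this model):

# Peek at the first entries of the generated cache; expect something like:
#   TRT-7103-EntropyCalibration2
#   Input: 3c010204
#   conv1/convolution: 3d8f5c29
head -n 5 $MODEL_OUTPATH/yolo_res18_qat/cal.bin

# Each value is a big-endian IEEE 754 float32 scale in hex;
# decode one to confirm it is a plausible scale factor:
python3 -c "import struct; print(struct.unpack('>f', bytes.fromhex('3c010204'))[0])"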

I see those two arguments; however, my question is: are they still needed when my input model was trained with QAT? From what I understand, the dynamic ranges for int8 are already computed during quantization-aware training, so there should be no need to calibrate again when exporting the model. @Morganh

Yes, it is still needed.
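Once the export succeeds, cal.bin is consumed when the .etlt is built into a TensorRT engine, for example with tlt-converter. A sketch, assuming the default YOLOv3 example input dimensions (3,384,1248) and output node (BatchedNMS); adjust both to match your spec:

tlt-converter -k $KEY \
    -d 3,384,1248 \
    -o BatchedNMS \
    -t int8 \
    -c $MODEL_OUTPATH/yolo_res18_qat/cal.bin \
    -m 16 \
    -e $MODEL_OUTPATH/yolo_res18_qat/yolo_res18_qat.engine \
    $MODEL_OUTPATH/yolo_res18_qat/yolo_res18_qat.etlt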

All right, understood, thanks. The documentation just might cause some confusion.

I’m confused here too, as I’m facing the same problem. The documentation clearly states that cal_image_dir is not needed and that the calibration scale factors are extracted from the .tlt file because it was trained with QAT enabled.

I will check further to see if there is an issue.