Errors: tlt-export of TLT YOLO model with INT8 calibration

Hi all,

I am running the YOLO example notebook from the TLT Version 2 release. I have successfully trained, pruned, and retrained YOLO-ResNet18. The FP16 and FP32 results are pretty good, but then I executed the tlt-export (INT8) command as shown in the notebook:
!tlt-export yolo -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolo_resnet18_epoch_$EPOCH.tlt \
                 -o $USER_EXPERIMENT_DIR/export/yolo_resnet18_epoch_$EPOCH.etlt \
                 -e $SPECS_DIR/yolo_retrain_resnet18_kitti.txt \
                 -k $KEY \
                 --cal_image_dir $USER_EXPERIMENT_DIR/data/val/image_2 \
                 --data_type int8 \
                 --batch_size 1 \
                 --batches 10 \
                 --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
                 --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile

The export failed while generating the INT8 implementation, with the following output:

Using TensorFlow backend.
2020-05-12 02:09:02,221 [INFO] /usr/local/lib/python2.7/dist-packages/iva/yolo/utils/spec_loader.pyc: Merging specification from /workspace/examples/yolo/specs/yolo_retrain_resnet18_kitti.txt
2020-05-12 02:09:05,844 [INFO] /usr/local/lib/python2.7/dist-packages/iva/yolo/utils/spec_loader.pyc: Merging specification from /workspace/examples/yolo/specs/yolo_retrain_resnet18_kitti.txt
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
Warning: No conversion function registered for layer: BatchedNMS_TRT yet.
Converting BatchedNMS as custom op: BatchedNMS_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample1/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample0/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_2 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_1 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_0 as custom op: BatchTilePlugin_TRT
DEBUG [/usr/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['BatchedNMS'] as outputs
2020-05-12 02:09:22,062 [WARNING] modulus.export._tensorrt: Calibration file /workspace/tlt-experiments/yolo/export/cal.bin exists but is being ignored.
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] WARNING: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[TensorRT] INFO: Starting Calibration with batch size 1.
DEPRECATED: This variant of get_batch is deprecated. Please use the single argument variant described in the documentation instead.
[TensorRT] INFO: Calibrated batch 0 in 0.101843 seconds.
[TensorRT] INFO: Calibrated batch 1 in 0.0926203 seconds.
[TensorRT] INFO: Calibrated batch 2 in 0.0919941 seconds.
[TensorRT] INFO: Calibrated batch 3 in 0.0910869 seconds.
[TensorRT] INFO: Calibrated batch 4 in 0.0929871 seconds.
[TensorRT] INFO: Calibrated batch 5 in 0.0934323 seconds.
[TensorRT] INFO: Calibrated batch 6 in 0.099967 seconds.
[TensorRT] INFO: Calibrated batch 7 in 0.104515 seconds.
[TensorRT] INFO: Calibrated batch 8 in 0.0996476 seconds.
[TensorRT] INFO: Calibrated batch 9 in 0.0921785 seconds.
[TensorRT] WARNING: Tensor BatchedNMS is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_1 is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_2 is uniformly zero; network calibration failed.
[TensorRT] INFO: Post Processing Calibration data in 3.91784 seconds.
[TensorRT] INFO: Calibration completed in 37.7898 seconds.
2020-05-12 02:09:59,897 [WARNING] modulus.export._tensorrt: Calibration file /workspace/tlt-experiments/yolo/export/cal.bin exists but is being ignored.
[TensorRT] INFO: Writing Calibration Cache for calibrator: TRT-7000-EntropyCalibration2
2020-05-12 02:09:59,897 [INFO] modulus.export._tensorrt: Saving calibration cache (size 9237) to /workspace/tlt-experiments/yolo/export/cal.bin
[TensorRT] WARNING: Rejecting int8 implementation of layer BatchedNMS due to missing int8 scales, will choose a non-int8 implementation.
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.

I am brand new to TLT and TensorRT, so does anyone have an idea of what is going on and a potential solution? I searched several forums but found nothing related to TLT Version 2.

I appreciate your help! Thanks!

As shown in the last cell of the YOLO notebook, please re-run it.

Note: If you see an error like "cannot find BatchedNMS" or "NoneType is not iterable" when running the cell below, please re-run the tlt-converter cell. This usually means the engine was not built successfully.

Hi Morgan,

Thanks for your reply. Actually, my issue is in generating the INT8 cache with tlt-export, rather than with tlt-infer or tlt-converter, specifically this line in the log:

[TensorRT] WARNING: Rejecting int8 implementation of layer BatchedNMS due to missing int8 scales, will choose a non-int8 implementation.

I did not hit any issue running tlt-converter or tlt-infer with the TRT engine. However, the result images in yolo_infer_images do not contain bounding boxes, and it looks like the INT8 calibration step failed.

For tlt-infer, it is not related to the .etlt model. When running tlt-infer, please set the .tlt model as its input.
So the INT8/FP16/FP32 export will not affect the tlt-infer result.
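
For example, the tlt-infer cell in the notebook looks roughly like this; the input/output paths and the $EPOCH variable here are assumptions based on the standard notebook layout, so adjust them for your setup:

!tlt-infer yolo -i $DATA_DOWNLOAD_DIR/testing/image_2 \
                -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                -e $SPECS_DIR/yolo_retrain_resnet18_kitti.txt \
                -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolo_resnet18_epoch_$EPOCH.tlt \
                -k $KEY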

Thanks for your explanation. But my issue happens when running tlt-export to generate the calibration cache, not when running tlt-infer, and I am confused as to why only the INT8 implementation fails to build.

Please refer to the warning message: [TensorRT] WARNING: Rejecting int8 implementation of layer BatchedNMS due to missing int8 scales, will choose a non-int8 implementation.

I am checking it and will update you with any findings.

Thank you!

Hi wdw0908,
You can ignore the warning log, because BatchTilePlugin does not need an INT8 implementation. It is used for the anchor boxes, which are constants.

Actually, you already have the correct output files: the .etlt file and the cal.bin file. Please go ahead.
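
That is, you can now feed the cal.bin to tlt-converter together with the .etlt file to build the INT8 TensorRT engine. A minimal sketch, assuming the KITTI input dimensions (3,384,1248) and the paths from the notebook, which may differ on your setup:

!tlt-converter -k $KEY \
               -d 3,384,1248 \
               -o BatchedNMS \
               -c $USER_EXPERIMENT_DIR/export/cal.bin \
               -e $USER_EXPERIMENT_DIR/export/trt.engine \
               -t int8 \
               -i nchw \
               $USER_EXPERIMENT_DIR/export/yolo_resnet18_epoch_$EPOCH.etlt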
