Performance of TensorRT conversion of ResNet50 on Quadro P6000

I am running your image_classification example from your docker image nvcr.io/nvidia/tensorflow:19.11-tf2-py3 as follows:

export CUDA_VISIBLE_DEVICES="0"
python image_classification.py \
    --data_dir /mytf/imagenet \
    --input_saved_model_dir /mytf/1 \
    --output_saved_model_dir /mytf/temp \
    --mode validation \
    --num_warmup_iterations 50 \
    --use_trt \
    --optimize_offline \
    --precision INT8 \
    --max_workspace_size $((2**32)) \
    --batch_size 128 \
    --target_duration 10 \
    --calib_data_dir /mytf/imagenet \
    --num_calib_inputs 128

The TensorRT conversion completes successfully, but I see no speedup relative to FP32. Upon closer examination of the generated model, the graph nodes retain FP32 types, so the result is not surprising. Given that this is running on a compute capability 6.1 GPU (Quadro P6000), why did the converted model not use INT8 as requested above? How do I demonstrate the INT8 performance on this model that is described in your documentation?

Hi,

Specifying the precision for a network defines the minimum acceptable precision for the application. Higher-precision kernels may still be chosen if they are faster for a particular set of kernel parameters, or if no lower-precision kernel exists.
You can set the builder configuration flag BuilderFlag::kSTRICT_TYPES to force the network or layer precision, although this may not give optimal performance. Use of this flag is recommended only for debugging purposes.
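For reference, here is a minimal sketch of how these flags can be set through the TensorRT C++ builder API; the function name configureBuilder is illustrative, and the workspace size simply mirrors the --max_workspace_size value from your command line. An INT8 calibrator still has to be provided separately.

#include "NvInfer.h"

// Sketch: request INT8 kernels and forbid silent fallback to higher precision.
void configureBuilder(nvinfer1::IBuilderConfig* config,
                      nvinfer1::IInt8Calibrator* calibrator)
{
    config->setFlag(nvinfer1::BuilderFlag::kINT8);          // allow INT8 kernels
    config->setFlag(nvinfer1::BuilderFlag::kSTRICT_TYPES);  // enforce the requested precision
    config->setMaxWorkspaceSize(1ULL << 32);                // 4 GiB, matching --max_workspace_size
    config->setInt8Calibrator(calibrator);                  // calibration is still required for INT8
}

Note that with kSTRICT_TYPES the builder will fail or pick slower kernels where no INT8 implementation exists, which is why it is intended for debugging rather than deployment.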

Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-601/tensorrt-developer-guide/index.html#enable_fp16_c

Thanks