Python API - int8_calibrator not used when calling build_engine (but works when calling build_cuda_engine)



I’m trying to convert models from PyTorch -> ONNX -> TensorRT. Ideally, I would like to use INT8 and support dynamic input sizes.
I can create an INT8-calibrated model if I use builder.build_cuda_engine(network), and I can get dynamic input support via optimization profiles if I use builder.build_engine(network, config).
The latter option seems to always ignore the int8_calibrator, regardless of whether I set it on the builder or on the config object, and even if I remove the dynamic-shape optimization profiles (see code snippet below).

Please let me know if what I’m trying here is not supported, or suggest any other way to make this work.



TensorRT Version:
GPU Type: T4
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Steps To Reproduce

def build_engine(onnx_file_path, input_name, int8_calibrator=None,
                 max_batch_size=1, img_size=None, min_size=None, max_size=None):
    # initialize TensorRT engine and parse ONNX model
    with trt.Builder(TRT_LOGGER) as builder, builder.create_builder_config() as config:
        network_creation_flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
        network = builder.create_network(network_creation_flag)
        parser = trt.OnnxParser(network, TRT_LOGGER)

        # parse ONNX
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        print('Completed parsing of ONNX file')
        # allow TensorRT to use up to 8GB of GPU memory for tactic selection
        config.max_workspace_size = 8 << 30

        # use FP16 mode if possible
        if builder.platform_has_fast_fp16:
            builder.fp16_mode = True
            print('USING FP16!!!')
        if int8_calibrator is not None:
            builder.int8_mode = True
            config.int8_calibrator = int8_calibrator
            builder.int8_calibrator = int8_calibrator
            print('USING INT8!!!', builder.platform_has_fast_int8)

        # # Dynamic input support - commented out for testing (still int8 calibration is not working)
        # if img_size is not None:  # dynamic
        #     opt_min, opt_max = min(img_size), max(img_size)
        #     # landscape profile
        #     profile = builder.create_optimization_profile()
        #     profile.set_shape(input_name, min=(1, 3, min_size, opt_max), opt=(max_batch_size, 3, opt_min, opt_max),
        #                       max=(max_batch_size, 3, opt_max, opt_max))
        #     config.add_optimization_profile(profile)
        #     # portrait profile
        #     profile = builder.create_optimization_profile()
        #     profile.set_shape(input_name, min=(1, 3, opt_max, min_size), opt=(max_batch_size, 3, opt_max, opt_min),
        #                       max=(max_batch_size, 3, opt_max, opt_max))
        #     config.add_optimization_profile(profile)

        # generate TensorRT engine optimized for the target platform
        print('Building an engine...')
        # engine = builder.build_cuda_engine(network)
        engine = builder.build_engine(network, config)
        print("Completed creating Engine")

    return engine

Please refer to the link below:


I found a solution, which is to use config.set_flag(trt.BuilderFlag.INT8) instead of builder.int8_mode = True.
The link to the example script given there is broken; here is the updated link:
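In other words, the precision flags and the calibrator both need to go on the builder config when you build with build_engine(network, config); the old builder attributes (builder.int8_mode, builder.fp16_mode) are ignored on that path. A minimal sketch of the relevant part of the function above, assuming `builder`, `config`, and `int8_calibrator` are set up as in the original snippet (this is illustrative, not tested on every TensorRT version):

```python
# Enable reduced precision on the builder CONFIG, not the builder itself.
# builder.fp16_mode / builder.int8_mode are the deprecated per-builder
# attributes that build_engine(network, config) does not consult.
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)

if int8_calibrator is not None and builder.platform_has_fast_int8:
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = int8_calibrator
```

With these flags set on the config, the rest of the function can stay as-is: the engine is still built with builder.build_engine(network, config), and the optimization profiles for dynamic shapes can be re-enabled on the same config object.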