The compilation completes successfully, but when I load and run the model on the COCO dataset, I get identical outputs for every input. It is worth noting that an FP16 engine works correctly on the same COCO dataset with the same command line, just with the --fp16 flag.
Environment
TensorRT Version: 10.3
GPU Type: Tesla
Nvidia Driver Version:
CUDA Version: 12.1
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.10.14
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.1.0+cu121
Baremetal or Container (if container which image + tag):
I've had the same issue with ResNet-50 trained on ImageNet-1k when I calibrate a model using the Python API (setting trt.BuilderFlag.INT8 and trt.BuilderFlag.FP16 when building the engine file for my model).
I am using the torchvision version of ResNet-50: PyTorch docs
My calibrator is based on IInt8EntropyCalibrator2: TensorRT GitHub
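A minimal sketch of what that calibrator looks like (simplified, not my exact code: the class name, the pre-processed batch list, and the cache file path below are placeholders):

import os
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds pre-processed NumPy batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file="calibration.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = list(batches)           # list of float32 arrays, shape (N, C, H, W)
        self.cache_file = cache_file
        self.index = 0
        self.device_input = cuda.mem_alloc(self.batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                        # None tells TensorRT that calibration data is exhausted
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # Reuse a previous calibration cache if present; delete the file to force re-calibration.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)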
My environment is similar to the one above:
Environment
TensorRT Version: 10.0.1
GPU Type: Tesla
Nvidia Driver Version: 10.1 (nvcc --version) / 550.54.15 (nvidia-smi)
CUDA Version: 12.4
CUDNN Version:
Operating System + Version: Ubuntu 20.04.6
Python Version (if applicable): 3.10.13
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.2.0+cu121
Baremetal or Container (if container which image + tag): Baremetal
Relevant Files
The Python function I use to create a TRT engine, including in INT8:
def create_engine(self, engine_path, calibrator=None):
    """
    Build the TensorRT engine and serialize it to disk.
    :param engine_path: The path where to serialize the engine to.
    :param calibrator: Optional INT8 calibrator, used when self.precision == 'int8'.
    """
    engine_path = os.path.realpath(engine_path)
    engine_dir = os.path.dirname(engine_path)
    os.makedirs(engine_dir, exist_ok=True)
    self.log.info("Building {} Engine in {}".format(self.precision, engine_path))
    inputs = [self.network.get_input(i) for i in range(self.network.num_inputs)]
    profile = self.builder.create_optimization_profile()
    model_input_name = inputs[0].name
    profile.set_shape(
        input=model_input_name,  # name of input tensor - must match the input of the ONNX model
        min=[self.min_batch] + list(self.input_shape),  # minimum input size
        opt=[self.opt_batch] + list(self.input_shape),  # optimal input size
        max=[self.max_batch] + list(self.input_shape)   # maximum input size
    )
    self.config.add_optimization_profile(profile)
    self.config.set_calibration_profile(profile)
    self.config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED
    # https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#reduced-precision
    if self.precision == 'fp16':
        self.config.set_flag(trt.BuilderFlag.FP16)
    elif self.precision == 'int8':
        self.config.int8_calibrator = calibrator
        self.config.set_flag(trt.BuilderFlag.FP16)
        self.config.set_flag(trt.BuilderFlag.INT8)
    with open(engine_path, "wb") as f:
        f.write(self.builder.build_serialized_network(self.network, self.config))
    self.engine_file = engine_path
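And this is roughly how I deserialize and run the engine afterwards (a simplified sketch using the TensorRT 10 named-tensor API with pycuda; the engine file name, input shape, and single-input/single-output assumption are placeholders, not my exact inference code):

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(logger)
with open("resnet50_int8.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
input_name = engine.get_tensor_name(0)    # assumes one input tensor followed by one output tensor
output_name = engine.get_tensor_name(1)

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input batch
context.set_input_shape(input_name, batch.shape)
output = np.empty(tuple(context.get_tensor_shape(output_name)), dtype=np.float32)

d_input = cuda.mem_alloc(batch.nbytes)
d_output = cuda.mem_alloc(output.nbytes)
context.set_tensor_address(input_name, int(d_input))
context.set_tensor_address(output_name, int(d_output))

stream = cuda.Stream()
cuda.memcpy_htod_async(d_input, np.ascontiguousarray(batch), stream)
context.execute_async_v3(stream_handle=stream.handle)
cuda.memcpy_dtoh_async(output, d_output, stream)
stream.synchronize()
print(output.argmax(axis=1))  # predicted class indices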