Building an INT8 engine for a YOLOv8 QAT model fails
I am training a YOLOv8-seg model with quantization-aware training (QAT) using the Ultralytics framework.
After exporting it to deploy_model.onnx and dynamic_range.json, I use the TensorRT Python API to build the engine. The code is:
import onnx
import pycuda.autoinit # noqa F401
import tensorrt as trt
import json
import os
import numpy as np
import argparse
def onnx2trt(onnx_model,
             trt_path,
             # dataset_path,
             batch_size=1,
             cali_batch=10,
             log_level=trt.Logger.ERROR,
             max_workspace_size=1 << 30,
             device_id=0,
             mode='fp32',
             is_explicit=False,
             dynamic_range_file=None):
    if os.path.exists(trt_path):
        print(f'The "{trt_path}" exists. Remove it and continue.')
        os.remove(trt_path)

    # create builder and network
    logger = trt.Logger(log_level)
    builder = trt.Builder(logger)
    EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(EXPLICIT_BATCH)

    # parse the ONNX model (accepts a file path or an already loaded ModelProto)
    parser = trt.OnnxParser(network, logger)
    if isinstance(onnx_model, str):
        onnx_model = onnx.load(onnx_model)
    if not parser.parse(onnx_model.SerializeToString()):
        error_msgs = ''
        for error in range(parser.num_errors):
            error_msgs += f'{parser.get_error(error)}\n'
        raise RuntimeError(f'parse onnx failed:\n{error_msgs}')

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, max_workspace_size)

    if mode == 'int8':
        config.set_flag(trt.BuilderFlag.INT8)
        if dynamic_range_file:
            # per-tensor amax values exported together with the QAT model
            with open(dynamic_range_file, 'r') as f:
                dynamic_range = json.load(f)['tensorrt']['blob_range']

            # set the dynamic range of the network inputs
            for input_index in range(network.num_inputs):
                input_tensor = network.get_input(input_index)
                if input_tensor.name in dynamic_range:
                    print("input_tensor.name", input_tensor.name)
                    amax = dynamic_range[input_tensor.name]
                    input_tensor.dynamic_range = (-amax, amax)
                    print(f'Set dynamic range of {input_tensor.name} as [{-amax}, {amax}]')

            # set the dynamic range of the first output of every layer
            for layer_index in range(network.num_layers):
                layer = network[layer_index]
                output_tensor = layer.get_output(0)
                print("output_tensor.name", output_tensor.name)
                if output_tensor.name in dynamic_range:
                    amax = dynamic_range[output_tensor.name]
                    output_tensor.dynamic_range = (-amax, amax)
                    print(f'Set dynamic range of {output_tensor.name} as [{-amax}, {amax}]')

    # build the serialized engine; build_engine() returns an ICudaEngine,
    # which cannot be written to a file or deserialized directly
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        raise RuntimeError('build engine failed')

    with open(trt_path, "wb") as f:
        f.write(serialized_engine)

    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized_engine)
    return engine
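Before the build, it may help to check how much of the network is actually covered by dynamic_range.json, since the script silently skips any tensor whose name is not in the file. The following is a minimal sketch, assuming the JSON layout the script reads (`['tensorrt']['blob_range']`, name to amax). `load_blob_range` and `split_by_coverage` are hypothetical helper names, not part of TensorRT or Ultralytics, and the check does not by itself explain the build error.

```python
import json
import math


def load_blob_range(path):
    """Read the per-tensor amax table from a dynamic-range JSON file.

    Assumed layout (taken from the build script):
    {"tensorrt": {"blob_range": {tensor_name: amax, ...}}}
    """
    with open(path, "r") as f:
        return json.load(f)["tensorrt"]["blob_range"]


def split_by_coverage(tensor_names, blob_range):
    """Split tensor names into (covered, missing, invalid) lists.

    covered: names with a finite, positive numeric amax
    missing: names that do not appear in blob_range
    invalid: names whose amax is non-numeric, non-finite, or <= 0
    """
    covered, missing, invalid = [], [], []
    for name in tensor_names:
        if name not in blob_range:
            missing.append(name)
            continue
        amax = blob_range[name]
        is_number = isinstance(amax, (int, float)) and not isinstance(amax, bool)
        if is_number and math.isfinite(amax) and amax > 0:
            covered.append(name)
        else:
            invalid.append(name)
    return covered, missing, invalid
```

In the build script, the names could be gathered from the parsed network, for example with `network.get_input(i).name` for each input and `layer.get_output(j).name` for `j in range(layer.num_outputs)` on each layer. Iterating over every output index, rather than only index 0, would also list tensors that the loop in the script never looks up.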
My environment is TensorRT 8.6 with CUDA 11.4.
The error occurs while building the engine:
[07/24/2025-17:28:53] [TRT] [V] *************** Autotuning Reformat: Float(1:4,1600,40,1) -> Int8(9600,1600:32,40,1) ***************
[07/24/2025-17:28:53] [TRT] [V] --------------- Timing Runner: /Slice_8_output_0 copy (Reformat[0x80000006])
[07/24/2025-17:28:53] [TRT] [V] Skipping tactic 0x00000000000003e8 due to exception an illegal memory access was encountered
[07/24/2025-17:28:53] [TRT] [V] /Slice_8_output_0 copy (Reformat[0x80000006]) profiling completed in 0.00831625 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[07/24/2025-17:28:53] [TRT] [V] --------------- Timing Runner: /Slice_8_output_0 copy (MyelinReformat[0x80000035])
[07/24/2025-17:28:53] [TRT] [V] MyelinReformat has no valid tactics for this config, skipping
[07/24/2025-17:28:53] [TRT] [V] Deleting timing cache: 3616 entries, served 4867 hits since creation.
[07/24/2025-17:28:53] [TRT] [E] 2: Impossible to reformat.
[07/24/2025-17:28:53] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[07/24/2025-17:28:53] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[07/24/2025-17:28:54] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[07/24/2025-17:28:54] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[07/24/2025-17:28:54] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[07/24/2025-17:28:54] [TRT] [E] 1: [cudaResources.cpp::~ScopedCudaStream::47] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[07/24/2025-17:28:54] [TRT] [E] 2: [optimizer.cpp::computeCosts::4194] Error Code 2: Internal Error (Impossible to reformat.)
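In a long verbose log, the tensors and formats involved in skipped tactics can be pulled out with a small parser. This is a rough sketch that assumes the line formats seen in the log above. `summarize_reformat_failures` is a hypothetical helper, not a TensorRT API, and the regular expressions may need adjusting for other TensorRT versions.

```python
import re

# Patterns modeled on the verbose log lines quoted above (an assumption,
# not a documented log format).
AUTOTUNE_RE = re.compile(r"Autotuning Reformat: (?P<src>\S+) -> (?P<dst>\S+) \*")
RUNNER_RE = re.compile(r"Timing Runner: (?P<name>.+?) \((?P<kind>\w+)\[")
SKIP_RE = re.compile(
    r"Skipping tactic (?P<tactic>0x[0-9a-fA-F]+) due to exception (?P<reason>.+)$"
)


def summarize_reformat_failures(lines):
    """Return one dict per skipped tactic, with the reformat it belongs to."""
    failures = []
    src = dst = name = kind = None
    for raw in lines:
        line = raw.rstrip()
        match = AUTOTUNE_RE.search(line)
        if match:
            # A new reformat is being autotuned; reset the runner context.
            src, dst = match.group("src"), match.group("dst")
            name = kind = None
            continue
        match = RUNNER_RE.search(line)
        if match:
            name, kind = match.group("name"), match.group("kind")
            continue
        match = SKIP_RE.search(line)
        if match:
            failures.append({
                "tensor": name,
                "runner": kind,
                "src_format": src,
                "dst_format": dst,
                "tactic": match.group("tactic"),
                "reason": match.group("reason"),
            })
    return failures
```

Called on the lines of a saved log file (for example `summarize_reformat_failures(open(path))`, where `path` is wherever the log was written), it returns one entry per skipped tactic, which makes it easier to see whether the failures cluster around a single tensor such as `/Slice_8_output_0 copy`.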
Environment
TensorRT Version: 8.6
GPU Type: 3080TI
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version:
Operating System + Version: 20.04
Python Version (if applicable): 3.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.7.1
Baremetal or Container (if container which image + tag):