Description
Hi, I’m trying to convert my ONNX model to TensorRT format, but I run into the errors below.
Can anyone help me solve this problem?
My ONNX model is here: https://83516952-my.sharepoint.com/:u:/g/personal/eddie_hsiao_insign-medical_com/ESS-N89ev6JIgnqv9O5TjzMBu1JbjaDy2VkQqhJEq1K0wQ?e=RrnLTA
```
[02/24/2023-10:40:26] [TRT] [W] Skipping tactic 0x0000000000000000 due to exception autotuning: CUDA error 2 allocating 4362077693-byte buffer: out of memory
[02/24/2023-10:40:26] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[/neck/multi_att/Constant_8_output_0 + (Unnamed Layer* 12521) [Shuffle].../neck/multi_att/Reshape_18 + /neck/multi_att/Transpose_9]}.)
[02/24/2023-10:40:27] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
```
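For context, the byte counts involved can be checked with plain arithmetic (no TensorRT needed). Note that the `1024 << 30` used in my script below is actually 1 TiB, not 1 GiB:

```python
# Sanity-check the sizes in the error message and the script's workspace
# setting. Pure arithmetic; the numbers come from the log and script above.
failed_alloc = 4362077693      # buffer size TRT failed to allocate (from the warning)
workspace = 1024 << 30         # workspace value set in the script below

print(f"failed allocation: {failed_alloc / 2**30:.2f} GiB")  # about 4.06 GiB
print(f"workspace limit:   {workspace / 2**30:.0f} GiB")     # 1024 GiB = 1 TiB
print(f"1 GiB would be:    {1 << 30} bytes")                 # 1 << 30, not 1024 << 30
```

So the builder was allowed a far larger workspace than the comment in the script intends, and the failing tactic alone wanted about 4 GiB of device memory.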
Environment
TensorRT Version: 8.5.3.1
GPU Type: RTX3090
Nvidia Driver Version: 510.108.03
CUDA Version: 11.6
CUDNN Version:
Operating System + Version: Ubuntu20
Python Version (if applicable): 3.8
PyTorch Version (if applicable):
Steps To Reproduce
My Python script:
```python
import os
import logging
import argparse

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

from calibrator import *

parser = argparse.ArgumentParser()
parser.add_argument('--onnx', default='./tensorrt/depthformer.onnx')
parser.add_argument('--fp16', action='store_true')
parser.add_argument('--int8', action='store_true')
parser.add_argument('--savepth', default='./model.trt')
args = parser.parse_args()

cali = CenterNetEntropyCalibrator()
ctx = pycuda.autoinit.context
trt.init_libnvinfer_plugins(None, "")
TRT_LOGGER = trt.Logger()


def build_engine_from_onnx(onnx_file_path):
    engine = None
    EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(EXPLICIT_BATCH) as network, \
         builder.create_builder_config() as config, \
         trt.OnnxParser(network, TRT_LOGGER) as parser, \
         trt.Runtime(TRT_LOGGER) as runtime:
        # Note: 1024 << 30 is 1 TiB, not 1 GiB; 1 GiB would be 1 << 30
        config.max_workspace_size = 1024 << 30
        if args.fp16:
            print('*********************************************FP16 conversion')
            config.set_flag(trt.BuilderFlag.FP16)
        elif args.int8:
            print('*********************************************INT8 conversion')
            config.set_flag(trt.BuilderFlag.INT8)
            config.int8_calibrator = cali
        builder.max_batch_size = 128
        # Parse the model file
        assert os.path.exists(onnx_file_path), f'cannot find {onnx_file_path}'
        print(f'Loading ONNX file from path {onnx_file_path}...')
        with open(onnx_file_path, 'rb') as fr:
            if not parser.parse(fr.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                assert False
        print("Start to build Engine")
        plan = builder.build_serialized_network(network, config)
        engine = runtime.deserialize_cuda_engine(plan)
        plan = engine.serialize()
        savepth = './tensorrt/depthformer.trt'
        with open(savepth, "wb") as fw:
            fw.write(plan)
        print('Finish conversion')
    return engine


if __name__ == '__main__':
    engine = build_engine_from_onnx(args.onnx)
```
Execute command:

```
python3 <this file> --int8
```
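As a cross-check (an assumption on my side, not something I have verified on this model), the same build can be attempted with the `trtexec` tool that ships with TensorRT 8.5, which also lets me cap the workspace pool explicitly:

```shell
# Hypothetical cross-check with trtexec (bundled with TensorRT 8.5).
# --memPoolSize caps the workspace pool; the value is in MiB by default.
trtexec --onnx=./tensorrt/depthformer.onnx \
        --int8 \
        --memPoolSize=workspace:1024 \
        --saveEngine=./tensorrt/depthformer.trt
```

If `trtexec` fails on the same ForeignNode, that would point at the builder/model rather than at my script.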