Description
I want to use a dynamic batch size and shape in TensorRT.
I add two optimization profiles when converting the ONNX model to an engine: one with batch size 1 and the other with batch size 4. Below is the ONNX-to-engine code:
import tensorrt as trt

# Helpers used below (standard TensorRT sample boilerplate)
TRT_LOGGER = trt.Logger(trt.Logger.INFO)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

def GiB(val):
    return val * (1 << 30)

def build_engine(onnx_path, using_half, batch_size=1, dynamic_input=True):
    trt.init_libnvinfer_plugins(None, '')
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_batch_size = 4  # ignored for explicit-batch networks
        builder.max_workspace_size = GiB(1) * 5
        config = builder.create_builder_config()
        config.max_workspace_size = GiB(1) * 5
        if using_half:
            config.set_flag(trt.BuilderFlag.FP16)
        # Load the ONNX model and parse it to populate the TensorRT network.
        with open(onnx_path, 'rb') as model:
            if not parser.parse(model.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None
        if dynamic_input:
            # Profile 0: batch size 1
            profile = builder.create_optimization_profile()
            profile.set_shape(network.get_input(0).name, min=(1, 1, 32, 10), opt=(1, 1, 32, 420), max=(1, 1, 32, 1000))
            config.add_optimization_profile(profile)
            # Profile 1: batch size 4
            profile1 = builder.create_optimization_profile()
            profile1.set_shape(network.get_input(0).name, min=(4, 1, 32, 10), opt=(4, 1, 32, 420), max=(4, 1, 32, 1000))
            config.add_optimization_profile(profile1)
        return builder.build_engine(network, config)
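For reference, this is how I build and then inspect the engine (a minimal sketch; 'crnn.onnx' is just a placeholder path). My understanding is that with two optimization profiles and a single-input/single-output network, the engine duplicates its bindings per profile, so indices 0-1 belong to profile 0 and indices 2-3 to profile 1:

engine = build_engine('crnn.onnx', using_half=True)
print('profiles:', engine.num_optimization_profiles)  # expect 2
print('bindings:', engine.num_bindings)               # expect 4 (2 per profile)
for i in range(engine.num_bindings):
    print(i, engine.get_binding_name(i),
          'input' if engine.binding_is_input(i) else 'output',
          engine.get_binding_shape(i))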
When running inference with profile 0 (batch size 1), I set:
context = engine.create_execution_context()
context.active_optimization_profile = 0
context.set_binding_shape(0, img.shape)  # img.shape = (1, 1, 32, 208)
cuda.memcpy_htod_async(d_input, img, self.stream)
context.execute_async_v2(bindings=bindings, stream_handle=self.stream.handle)
cuda.memcpy_dtoh_async(outputs, d_output, self.stream)
Everything is fine.
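For completeness, the buffers and the bindings list used above are set up roughly like this (a simplified sketch; the float32 output dtype and the zero-filled slots for the inactive profile are assumptions on my part):

import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # creates and pushes a CUDA context

img = np.ascontiguousarray(img, dtype=np.float32)  # (1, 1, 32, 208)
context.set_binding_shape(0, img.shape)
out_shape = tuple(context.get_binding_shape(1))    # resolved output shape for profile 0
outputs = np.empty(out_shape, dtype=np.float32)

d_input = cuda.mem_alloc(img.nbytes)
d_output = cuda.mem_alloc(outputs.nbytes)
bindings = [0] * engine.num_bindings               # one slot per binding, all profiles
bindings[0] = int(d_input)                         # profile-0 input
bindings[1] = int(d_output)                        # profile-0 output
stream = cuda.Stream()                             # stored as self.stream in my class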
But when I want to use profile 1 (batch size 4), I set:
context = engine.create_execution_context()
context.active_optimization_profile = 1
context.set_binding_shape(2, img.shape)  # img.shape = (4, 1, 32, 208)
cuda.memcpy_htod_async(d_input, img, self.stream)
context.execute_async_v2(bindings=bindings, stream_handle=self.stream.handle)
cuda.memcpy_dtoh_async(outputs, d_output, self.stream)
it shows:
[TensorRT] INFO: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[TensorRT] ERROR: myelin/myelinRunner.cpp (372) - Myelin Error in execute: 68 (myelinCudaError : CUDA error 700 enqueueing async copy.
)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
0%| | 0/3059 [00:01<?, ?it/s]
Traceback (most recent call last):
File "crnn_pth2engine/crnn_trt_batch.py", line 164, in
preds, length, t_predict = crnn_handle.predict(img, batch_size)
File "crnn_pth2engine/crnn_trt_batch.py", line 102, in predict
cuda.memcpy_dtoh_async(outputs, d_output, self.stream)
pycuda._driver.LogicError: cuMemcpyDtoHAsync failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuStreamDestroy failed: an illegal memory access was encountered
PyCUDA ERROR: The context stack was not empty upon module cleanup.
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
Environment
TensorRT Version: TensorRT-7.2.3.4
GPU Type: T4
Nvidia Driver Version: 460.73.01
CUDA Version: 10.2
CUDNN Version: 8.4.0
Operating System + Version: CentOS 7
Python Version (if applicable): 3.6.13
TensorFlow Version (if applicable): none
PyTorch Version (if applicable): 1.6
Baremetal or Container (if container which image + tag): none
How can I use profile 1 for inference? Do I need to set the current profile to profile 1, and if so, how?
Thank you for your help!