Description
Hi,
I am utilizing YOLOv4 detection models for my project. I use AlexeyAB’s darknet fork for training custom YOLOv4 detection models. For TensorRT conversion, I use Tianxiaomo’s pytorch-YOLOv4 to parse the darknet models into PyTorch and then export them to ONNX using torch.onnx.export.
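For reference, the export step looks roughly like this (a minimal sketch assuming Tianxiaomo’s repo layout; the Darknet class usage, file names, and output names are illustrative and may differ slightly from the exact script I used):

import torch
from tool.darknet2pytorch import Darknet  # from Tianxiaomo's pytorch-YOLOv4

# Load the darknet cfg/weights into the PyTorch reimplementation.
model = Darknet('yolov4-tiny.cfg')
model.load_weights('yolov4-tiny.weights')
model.eval()

# Export with a static batch of 2, matching yolov4tiny_2_3_416_416_static.onnx.
dummy_input = torch.randn(2, 3, 416, 416)
torch.onnx.export(model, dummy_input,
                  'onnx_models/yolov4tiny_2_3_416_416_static.onnx',
                  opset_version=11,
                  input_names=['input'],
                  output_names=['boxes', 'confs'])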
The issue is that inference with the TensorRT model at batch size 1 works fine, but for batch size > 1 the inference results differ depending on the TensorRT conversion method.
Method #1: Conversion using trtexec
trtexec --onnx=onnx_models/yolov4tiny_2_3_416_416_static.onnx --maxBatch=2 --fp16 --saveEngine=trt_models/trtexec/yolov4tiny2-416.trt
For testing, I created a NumPy batch (batch size 2) by duplicating a single image, as sketched below. However, the inference results for the two entries are not the same despite coming from the same image: 5 objects are detected in the first frame but 11 objects are detected in the second.
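For clarity, the duplicated batch is built along these lines (a minimal sketch; the exact preprocessing in yolov4-tiny.py may differ):

import cv2
import numpy as np

# Preprocess a single frame to the network's 416x416 RGB CHW layout.
img = cv2.imread('TownCenter-008.jpg')
img = cv2.cvtColor(cv2.resize(img, (416, 416)), cv2.COLOR_BGR2RGB)
img = img.transpose(2, 0, 1).astype(np.float32) / 255.0

# Stack the same frame twice: both batch entries are identical,
# so they should produce identical detections.
batch = np.stack([img, img], axis=0)  # shape (2, 3, 416, 416)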
Method #2: Conversion using the TensorRT Python API
import traceback

import tensorrt as trt

# Use an explicit-batch network for TensorRT >= 7 (required by the ONNX parser).
EXPLICIT_BATCH = []
if trt.__version__[0] >= '7':
    EXPLICIT_BATCH.append(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
print(EXPLICIT_BATCH)


def build_engine(onnx_file_path, engine_file_path, verbose=False, batch_size=1):
    """Takes an ONNX file and creates a TensorRT engine."""
    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) if verbose else trt.Logger()
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(*EXPLICIT_BATCH) as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        if trt.__version__[0] >= '8':
            config = builder.create_builder_config()
            config.max_workspace_size = 1 << 28
            builder.max_batch_size = batch_size
            config.flags = 1 << int(trt.BuilderFlag.FP16)
            # config.flags |= 1 << int(trt.BuilderFlag.STRICT_TYPES)
        else:
            builder.max_workspace_size = 1 << 28
            builder.max_batch_size = batch_size
            builder.fp16_mode = True
            # builder.strict_type_constraints = True

        # Parse the model file.
        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None

        if trt.__version__[0] >= '7':
            # Overwrite the batch dimension of the network input
            # with the requested batch size.
            shape = list(network.get_input(0).shape)
            print(shape)
            shape[0] = batch_size
            network.get_input(0).shape = shape

        print('Completed parsing of ONNX file')
        print('Building an engine; this may take a while...')
        if trt.__version__[0] >= '8':
            engine = builder.build_engine(network, config)
        else:
            engine = builder.build_cuda_engine(network)
        print('Completed creating engine')

        try:
            with open(engine_file_path, 'wb') as f:
                f.write(engine.serialize())
            return engine
        except Exception:
            traceback.print_exc()
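trt_convert.py then drives build_engine roughly like this (an illustrative invocation; I am assuming the static batch-2 model from the file tree below):

engine = build_engine('onnx_models/yolov4_2_3_416_416_static.onnx',
                      'trt_models/pythonapi/yolov4tiny2-416.trt',
                      batch_size=2)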
The TensorRT model converted via the Python API produces different results from the trtexec one: it yields 11 detections for the first image in the batch (batch size 2) but empty detections for the second, similar to this issue reported by @thomallain. Only the first frame gets detection outputs; all the detector arrays for the second frame are zeros.
Environment
TensorRT Version: 8.0.0.3
GPU Type: GTX 1070
Nvidia Driver Version: 465.31
CUDA Version: 11.3
CUDNN Version: 8.2.2
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.5
Baremetal or Container (if container which image + tag): Baremetal
Relevant Files
File Structure
.
├── onnx_models
│   ├── yolov4_-1_3_416_416_dynamic.onnx
│   ├── yolov4_1_3_416_416_static.onnx
│   └── yolov4_2_3_416_416_static.onnx
├── TownCenter-008.jpg
├── trt_convert.py
├── trt_models
│   ├── pythonapi
│   │   └── yolov4tiny2-416.trt
│   └── trtexec
│       └── yolov4tiny2-416.trt
└── yolov4-tiny.py
Relevant scripts, ONNX models, and converted TensorRT models can be downloaded via this Google Drive link: issue - Google Drive
Steps To Reproduce
- Conversion via trtexec can be done with the command shown above.
- Conversion with the Python API can be done with trt_convert.py, passing the desired ONNX model as a parameter.
- Inference can be done with yolov4-tiny.py, passing the desired converted TensorRT model as a parameter; a condensed sketch of its inference path follows the command below.
python yolov4-tiny.py --trt trt_models/trtexec/yolov4tiny2-416.trt --conf_threshold 0.3 --nms_threshold 0.4 --num_classes 1 --batch_size 2
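For context, the inference path in yolov4-tiny.py follows the usual PyCUDA pattern. Below is a condensed, hypothetical sketch (the helper name infer and the buffer handling are illustrative; buffer sizes are taken from the engine bindings, which for an explicit-batch engine already include the batch dimension):

import numpy as np
import pycuda.autoinit  # noqa: F401 (initializes the CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

def infer(engine, batch):
    """Run one explicit-batch inference; assumes binding 0 is the input."""
    with engine.create_execution_context() as context:
        bindings, host_bufs, dev_bufs = [], [], []
        for i in range(engine.num_bindings):
            # Binding shapes of an explicit-batch engine include the batch
            # dimension, so these buffers cover the whole batch.
            size = trt.volume(engine.get_binding_shape(i))
            dtype = trt.nptype(engine.get_binding_dtype(i))
            host = cuda.pagelocked_empty(size, dtype)
            dev = cuda.mem_alloc(host.nbytes)
            bindings.append(int(dev))
            host_bufs.append(host)
            dev_bufs.append(dev)
        stream = cuda.Stream()
        np.copyto(host_bufs[0], batch.ravel())
        cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
        context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
        for host, dev in zip(host_bufs[1:], dev_bufs[1:]):
            cuda.memcpy_dtoh_async(host, dev, stream)
        stream.synchronize()
        return host_bufs[1:]  # raw outputs, post-processed with conf/NMS thresholds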
Could you share some suggestions on how to fix this so that batch inference runs as expected?
Thanks in advance!
Best Regards,
Htut