Description
I trained a YOLOv4 model and converted it along the path Darknet → ONNX → TensorRT.
My device is a Jetson Xavier NX.
I use the nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf2.2-py3 container.
Issue
If I use batch_size = 1, everything works fine.
If I use batch_size > 1, predictions come back only for the first image in the batch; the outputs for the remaining images are zero-padded.
Environment
TensorRT Version: 7.1.3.0
GPU Type: JETSON XAVIER NX
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8
Operating System + Version: Linux
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 2.2
PyTorch Version (if applicable): -
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf2.2-py3
Steps To Reproduce
- I trained a YOLOv4 model with Darknet (cfg and weights files).
- I used the pytorch-YOLOv4 repository for the Darknet → ONNX conversion:
python demo_darknet2onnx.py yolov4-tiny-3l.cfg yolov4-tiny-3l_final.weights img.jpg 16
The resulting model: yolov4_16_3_608_608_static.onnx
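As a sanity check, the batch size is baked into the exported graph and encoded in the filename (the `yolov4_{batch}_{channels}_{height}_{width}_static.onnx` pattern is inferred here from the export command and the resulting name, not from repository documentation):

```python
# Sketch: read the static shape back out of the exported ONNX filename.
# The {batch}_{channels}_{height}_{width} pattern is an assumption based
# on the export command (16, 608x608 cfg) and the output name above.
name = "yolov4_16_3_608_608_static.onnx"

stem = name.rsplit(".", 1)[0]           # drop ".onnx"
parts = stem.split("_")                 # ['yolov4','16','3','608','608','static']
batch, channels, height, width = (int(p) for p in parts[1:5])

print(batch, channels, height, width)   # 16 3 608 608
```

So the ONNX file itself expects a full 16-image input tensor; it is not a batch-1 model replicated at runtime.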
- Then I used trtexec to build the engine and run inference from my code. Everything works fine, but only when batch_size = 1. With batch_size = 16 I have problems; I tried the following options:
- with the --batch=16 parameter:
trtexec --onnx=yolov4_16_3_608_608_static.onnx --saveEngine=engine_b16_fp32.engine --workspace=4096 --buildOnly --batch=16
The resulting model: engine_b16_fp32.engine
Outputs from inference:
[[(0, 0.9952153, 877, 848, 113, 58), (0, 0.99335945, 1199, 171, 103, 80), (0, 0.99015826, 123, 147, 142, 54), (0, 0.98760045, 1483, 315, 73, 56), (0, 0.98437995, 134, 511, 74, 54)], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []]
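To make the failure mode concrete: the output above is what you get when only the first image's region of the flat output buffer is written and the rest stays zero. A minimal pure-Python sketch of that layout (the 6-value box tuple mirrors the output above; `boxes_per_image` is a made-up small number for illustration, not the model's real box count):

```python
# Sketch: how a flat per-batch output buffer maps to per-image detections.
# Assumed layout: batch_size * boxes_per_image boxes stored back to back,
# 6 floats per box (class_id, confidence, x, y, w, h).

batch_size = 16
boxes_per_image = 2
floats_per_box = 6

# Simulate an engine that only filled in the first image's slice:
flat = [0.0] * (batch_size * boxes_per_image * floats_per_box)
flat[0:6] = [0, 0.9952153, 877, 848, 113, 58]   # first box, image 0

per_image = []
for b in range(batch_size):
    start = b * boxes_per_image * floats_per_box
    boxes = []
    for i in range(boxes_per_image):
        s = start + i * floats_per_box
        box = flat[s:s + floats_per_box]
        if box[1] > 0:          # keep only boxes with nonzero confidence
            boxes.append(tuple(box))
    per_image.append(boxes)

print(per_image)   # image 0 has a detection, images 1..15 are all []
```

The post-processing itself is fine; it is the enqueue that only processed one image's worth of data.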
- with the --explicitBatch option:
trtexec --onnx=yolov4_16_3_608_608_static.onnx --saveEngine=engine_uavvaste_yolov4_tiny_3l_608_b16_fp32.engine --workspace=4096 --buildOnly --explicitBatch
Outputs from inference:
[TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::387, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 16, but engine max batch size was: 1
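This error is consistent with the engine being built in explicit-batch mode (the ONNX parser in TensorRT 7 always uses explicit batch, so the `--batch=16` flag is ignored and `engine.max_batch_size` reports 1). With explicit batch, the batch dimension lives in the tensor shapes themselves, and the implicit-batch `context.execute(batch_size=16, ...)` call is rejected; inference should go through `execute_v2`, which takes no batch-size argument. A hedged sketch of the call site, assuming a pycuda-style setup; the helper name and its arguments are mine, not from the repository:

```python
# Sketch: inference with an explicit-batch TensorRT 7 engine.
# There is no batch_size argument; the batch dimension (16 here) is part
# of the binding shapes, so every buffer must be allocated for the whole
# (16, ...) tensor, not for a single image.
try:
    import tensorrt as trt  # only available on the Jetson / in the L4T container
except ImportError:
    trt = None  # allows the sketch to be read off-device

def infer_batch(context, bindings):
    """Run one explicit-batch inference.

    context  -- trt.IExecutionContext created from the deserialized engine
    bindings -- list of device pointers (ints), one per engine binding,
                each allocated for the full 16-image tensor shape
    """
    # execute_v2 replaces execute(batch_size, bindings) for explicit batch.
    return context.execute_v2(bindings=bindings)
```

If the ONNX file had been exported with a dynamic batch dimension instead of the static 16, building the engine would additionally require an optimization profile (e.g. trtexec's shape flags), and the code would need a `context.set_binding_shape(...)` call before `execute_v2`.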
