TensorRT Batch Inferences : empty outputs

thomallain · December 2, 2020, 1:59pm

Description

Hi all,

I have been working in Python with YoloV3 and YoloV4 on Jetson Nano and Jetson Xavier NX for a while with a batch size of 1, and I have never had an issue there.

Now, for a project, I am trying to use YoloV4 (yolov4-288) with multiple inputs, i.e. a batch size of 2.

I have been able to convert YoloV4 to an ONNX file with a static input dimension of [64, 3, 288, 288], and then to convert it to a TensorRT file with an input size of [2, 3, 288, 288].

When I run inference with the frames of two videos as inputs, the engine output is good only for the first frame, while it is completely null for the second frame.

I would like to know what I can do to make inference work for the second frame.

Environment

TensorRT Version: 7.1.3
GPU Type: Jetson Xavier NX
CUDA Version: 10.2
Jetpack: 4.4.1
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.15
Baremetal or Container (if container which image + tag): Baremetal

Relevant Files

All files including the yolov4.onnx, yolov4.trt, the conversion scripts and a script test_batches.py with the video files to test on, are on my Google Drive : YoloV4_XavierNX - Google Drive

Steps To Reproduce

The script test_batches.py shows the number and indexes of detections from the outputs of the trt engine. Here the trt engine gets the frames of two videos. It will raise an error after processing the first batch (if you comment out line 285, it will process the following frames).

The detection outputs here only concern the first frame. All the detection arrays for the second frame are equal to zero. If you replace the second frame with Nothing or with the first frame, the results are the same.
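To make the symptom easier to check, here is a minimal sketch that reports which images in a batch have an all-zero output. It assumes the engine output has been copied into one flat host buffer laid out image by image (image 0 first), which is not confirmed by test_batches.py. The helper name `empty_batch_items` is hypothetical.

```python
def empty_batch_items(flat_output, batch_size):
    """Return one flag per image: True if that image's slice is all zeros.

    Assumption: flat_output is a flat sequence (list or 1-D array) holding
    the outputs of all images back to back, image 0 first.
    """
    n = len(flat_output)
    if batch_size < 1 or n % batch_size != 0:
        raise ValueError("output length must be a multiple of batch_size")
    per_image = n // batch_size
    return [
        not any(v != 0 for v in flat_output[i * per_image:(i + 1) * per_image])
        for i in range(batch_size)
    ]


# Illustrative values only: image 0 has data, image 1 is all zeros.
example = [0.5, -1.2, 0.3, 0.0, 0.0, 0.0]
print(empty_batch_items(example, batch_size=2))
```

Running this on each output binding shows whether the zeros are confined to the same batch index across all outputs.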

AakankshaS · December 4, 2020, 6:00am

Hi @thomallain,
You are currently using static shapes.
To make this work for a different batch size, you should use dynamic shapes.

Thanks!

thomallain · December 11, 2020, 7:31am

Hi @AakankshaS,

I have followed this topic to use dynamic shapes: ONNX to TensorRT with dynamic batch size in Python

I have changed my conversion from Darknet to ONNX to get an ONNX model with a batch size of N (as in the other topic): yolov4-416.zip - Google Drive

To convert my ONNX file to TensorRT, I used the following command (which works):

trtexec --onnx=yolov4-416.onnx --fp16 --workspace=4096 --explicitBatch --verbose --shapes=000_net:2x3x416x416 --saveEngine=yolov4-416.trt --optShapes=000_net:2x3x416x416 --maxShapes=000_net:4x3x416x416 --minShapes=000_net:1x3x416x416

Now, at runtime, I get the following:

```
engine.get_binding_shape(0) -> (-1, 3, 416, 416)
engine.max_batch_size -> 1
engine.num_optimization_profiles -> 1
```

So according to the other topic, the input shape and max_batch_size are correct. But my engine only considers that it has one optimization profile (with the -1 dimension), and not even the one under --shapes from the trtexec command. Because of that, it tries to allocate a negative amount of memory.

Why are the optimization profiles from my trtexec command not present? And is this the right way to use dynamic shapes for batch inference?
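For illustration, the negative allocation can be reproduced without TensorRT. A single optimization profile carries its own min, opt, and max shapes, so the engine itself still reports -1 for the batch dimension. The sketch below assumes the buffer size is derived from the raw binding shape, which is a guess about the allocation code and not confirmed in this thread. The helper names are hypothetical, and `volume` only mirrors what `trt.volume` computes.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # assumption: the input binding is float32


def volume(shape):
    """Product of all dimensions, mirroring what trt.volume computes."""
    return reduce(mul, shape, 1)


def resolve_shape(binding_shape, batch_size):
    """Replace each dynamic (-1) dimension with a concrete batch size.

    Assumption: the batch dimension is the only dynamic one.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return tuple(batch_size if d == -1 else d for d in binding_shape)


def buffer_nbytes(binding_shape, max_batch_size, itemsize=FLOAT32_BYTES):
    """Bytes needed to hold the largest batch the profile allows."""
    return volume(resolve_shape(binding_shape, max_batch_size)) * itemsize


binding_shape = (-1, 3, 416, 416)  # as reported by engine.get_binding_shape(0)
max_profile_batch = 4              # from --maxShapes=000_net:4x3x416x416

naive = volume(binding_shape) * FLOAT32_BYTES  # negative because of the -1
safe = buffer_nbytes(binding_shape, max_profile_batch)
print(naive, safe)
```

With a real engine, the concrete input shape would also have to be set on the execution context before inference (in TensorRT 7, `context.set_binding_shape(0, (2, 3, 416, 416))`), so that the output bindings resolve to concrete shapes as well.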

edgetinker · January 6, 2021, 8:50pm

I have also followed the dynamic shapes process, for batch size 3, and am also seeing missing inferences for the first two images. My output:

[array([0., 0., 0., …, 0., 0., 0.], dtype=float32), array([0., 0., 0., …, 0., 0., 0.], dtype=float32), array([-0.6542635, 0.774671 , -2.804221 , …, 0. , 0. ,
0. ], dtype=float32)]

AakankshaS · April 1, 2021, 4:03am

Hi @thomallain ,
Apologies for the delayed response. Are you still facing the issue?

Htut · July 7, 2021, 8:21am

@thomallain , Hi, I am also facing the same issue. Have you found the solution to this problem?

NVES · July 7, 2021, 8:37am

Hi,
This looks like a Jetson issue. We recommend that you raise it on the respective platform forum using the link below.

Thanks!

Htut · July 7, 2021, 9:47am

@NVES , Hi. In my case, I generated the TensorRT model using trtexec on a GTX 1070, so I think this particular issue also exists on devices other than Jetson systems.

sachint.2306 · July 18, 2024, 9:42pm

I am also facing the exact same issue: running inference on a batch results in only the first tensor of the batch being populated, leaving the others null.
