Description
I exported yolov8n-seg.pt to ONNX format, and then converted the ONNX model into a TensorRT engine using the trtexec command-line tool.
The output shapes reported by the ONNX model are as follows:
output0: float32(1, 6, N) → Bounding Boxes
output1: float32(1, 160, 160, N) → Masks
Here:
* The first dimension in both outputs represents the batch size.
* The last dimension in both outputs represents the number of detections, with the same number of corresponding masks.
* The second dimension in output0 consists of 4 bounding box coordinates in X1, Y1, X2, Y2 format, a class
confidence score, and the corresponding class ID.
* The second and third dimensions in output1 represent the size of each mask, i.e., 160x160.
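Given that layout, the detection fields can be pulled out of output0 with plain NumPy indexing. A small sketch, using a dummy array in place of real model output (the values are illustrative):

```python
import numpy as np

# Dummy output0 standing in for real model output: N = 2 detections,
# second dim holds x1, y1, x2, y2, confidence, class_id.
output0 = np.zeros((1, 6, 2), dtype=np.float32)
output0[0, :, 0] = [10, 20, 110, 220, 0.9, 3]  # one fake detection

boxes = output0[0, 0:4, :].T              # (N, 4) -> x1, y1, x2, y2
scores = output0[0, 4, :]                 # (N,)  -> confidence per detection
class_ids = output0[0, 5, :].astype(int)  # (N,)  -> class ID per detection
```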
The output shapes reported by the TensorRT engine are as follows:
- Name: output0
- Shape: (1, 6, -1)
- Dtype: FLOAT
- Type: Device (Tensor)
- Name: output1
- Shape: (1, 160, 160, -1)
- Dtype: FLOAT
- Type: Device (Tensor)
Here, -1 is dynamic and depends solely on the input data.
While allocating memory for the outputs, I get the following error:
OverflowError: can't convert negative value to unsigned int
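As far as I can tell, this happens because the buffer size is computed from the engine's reported shape while it still contains -1, which makes the volume negative. A minimal reproduction of the arithmetic, independent of TensorRT:

```python
import math

def volume(shape):
    """np.prod / trt.volume-style product; a -1 (dynamic) dim makes it negative."""
    return math.prod(shape)

nbytes = volume((1, 6, -1)) * 4  # float32 -> -24 bytes
# Passing a negative size to an allocator that expects an unsigned int
# (e.g. pycuda's mem_alloc) raises:
#   OverflowError: can't convert negative value to unsigned int
```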
I found the following two approaches that might work:
* Deferred output allocation: IOutputAllocator interface
* Pre-allocate memory: getMaxOutputSize method (the TensorRT Python API doesn't seem to expose a direct method called getMaxOutputSize)
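From the API docs, the deferred route looks roughly like this — a sketch I have not verified, assuming pycuda for device memory and an input tensor named "images" (adjust the tensor names to your engine):

```python
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context
import tensorrt as trt

class DeferredOutputAllocator(trt.IOutputAllocator):
    """TensorRT calls back with the exact size/shape once it is known."""

    def __init__(self):
        super().__init__()
        self.buffers = {}  # tensor name -> device allocation
        self.shapes = {}   # tensor name -> concrete output shape

    def reallocate_output(self, tensor_name, memory, size, alignment):
        # 'size' is the exact byte count TensorRT needs for this output.
        self.buffers[tensor_name] = cuda.mem_alloc(max(size, 1))
        return int(self.buffers[tensor_name])

    def notify_shape(self, tensor_name, shape):
        # Final shape with the -1 resolved, e.g. (1, 6, 37).
        self.shapes[tensor_name] = tuple(shape)

# Usage sketch (assumes 'context', 'stream', and input buffer 'd_input'
# already exist from the usual engine-deserialization boilerplate):
# allocator = DeferredOutputAllocator()
# context.set_output_allocator("output0", allocator)
# context.set_output_allocator("output1", allocator)
# context.set_tensor_address("images", int(d_input))
# context.execute_async_v3(stream.handle)
# stream.synchronize()
# out0 = np.empty(allocator.shapes["output0"], dtype=np.float32)
# cuda.memcpy_dtoh(out0, allocator.buffers["output0"])
```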
But the problem is that I don't know how to implement either of these correctly. Any insightful responses would be much appreciated!
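For the pre-allocation route, I assume all that is needed is an upper bound on the dynamic dimension (e.g. the NMS top-k cap). A sketch of the size computation, where MAX_DETS is my assumed cap, not something read from the engine:

```python
import numpy as np

MAX_DETS = 300  # assumed upper bound on detections (e.g. NMS top-k)

def upper_bound_shape(shape, max_dyn=MAX_DETS):
    """Replace dynamic (-1) dims with a fixed upper bound."""
    return tuple(max_dyn if d < 0 else d for d in shape)

def buffer_nbytes(shape, dtype=np.float32, max_dyn=MAX_DETS):
    """Worst-case byte size of a buffer for this tensor."""
    return int(np.prod(upper_bound_shape(shape, max_dyn))) * np.dtype(dtype).itemsize

# Worst-case sizes for the two outputs above:
out0_bytes = buffer_nbytes((1, 6, -1))         # 1 * 6 * 300 * 4
out1_bytes = buffer_nbytes((1, 160, 160, -1))  # 1 * 160 * 160 * 300 * 4
# Allocate these once (e.g. with pycuda.driver.mem_alloc), run inference,
# then query context.get_tensor_shape("output0") for the real N and
# slice the host copy down to it.
```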
Environment
TensorRT Version: 8.6.1
GPU Type: Nvidia GeForce GTX 1050 Ti, 4GB
Nvidia Driver Version: 555.42.02
CUDA Version: 11.8
CUDNN Version: 8.9.0
Operating System + Version: Ubuntu 22.04.4 LTS
Python Version (if applicable): 3.10.11
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):