TensorRT Inference

ozan.anli · March 7, 2023, 3:20pm

Description

Hello,

I trained a DETR model on a custom dataset and then converted it to a TensorRT engine to get faster inference. As a test, I wrote a script on the target system that runs inference on a single image.

The problem is that the predicted bounding boxes do not fall within the range [0, 1] (see the output below).
Do you know what could be causing this?

For reference:

preprocess:

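import torchvision.transforms as T

# Resize to 800x800, convert to a CHW float tensor in [0, 1],
# then normalize with the ImageNet mean and standard deviation.
transform = T.Compose([
    T.Resize((800, 800)),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])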

model deserialize:

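import tensorrt as trt

# Assumed setup: the original post does not show how `runtime` was created.
runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))

with open(engine_file, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()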

memory allocation:

import numpy as np
import pycuda.driver as cuda

# Host arrays that define the size of each buffer and later receive the outputs.
input_dimension = np.empty(
    [batch_size, channel_size, image_size, image_size], dtype=PRECISION)
output_boxes_dimension = np.empty(
    [batch_size, n_predicitons, 4], dtype=PRECISION)
output_logits_dimension = np.empty(
    [batch_size, n_predicitons, n_CLASSES + 1], dtype=PRECISION)

# Device buffers sized to match the host arrays.
cuda_inputs = cuda.mem_alloc(input_dimension.nbytes)
cuda_outputs_boxes = cuda.mem_alloc(output_boxes_dimension.nbytes)
cuda_outputs_logits = cuda.mem_alloc(output_logits_dimension.nbytes)

# The list is positional: entry i must be the buffer for engine binding index i.
bindings = [int(cuda_inputs), int(cuda_outputs_boxes), int(cuda_outputs_logits)]
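
One thing worth ruling out here: because the `bindings` list is positional, the host arrays would receive each other's data if the engine lists the logits output before the boxes output. The sketch below orders the device pointers by binding name instead of by hand. It assumes the name list comes from the engine (for example `engine.get_binding_name(i)` for each `i` in `range(engine.num_bindings)` on TensorRT 8.x). The helper `order_bindings` and the names `images`, `pred_logits`, and `pred_boxes` are hypothetical and depend on how the ONNX model was exported.

```python
def order_bindings(binding_names, device_ptrs_by_name):
    """Return device pointers in the same order as the engine's bindings.

    binding_names: names in engine binding order (assumed to be read from the engine).
    device_ptrs_by_name: dict mapping each binding name to its device allocation.
    """
    missing = [name for name in binding_names if name not in device_ptrs_by_name]
    if missing:
        raise KeyError(f"no buffer allocated for bindings: {missing}")
    return [int(device_ptrs_by_name[name]) for name in binding_names]


# Hypothetical engine order in which the logits come before the boxes.
engine_order = ["images", "pred_logits", "pred_boxes"]
# Placeholder integers stand in for real device allocations.
ptrs = {"images": 0x1000, "pred_boxes": 0x2000, "pred_logits": 0x3000}

bindings_example = order_bindings(engine_order, ptrs)
print(bindings_example)
```

Printing the binding names and shapes once at startup is usually enough to confirm whether the hand-written order matches the engine.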

image:

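# Load the image as RGB, apply the preprocessing transform, and add a batch dimension.
cv_image = cv2.imread(image_file)
cv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)
pil_image = Image.fromarray(cv_image)
t_image = transform(pil_image).unsqueeze(0)
# Contiguous float32 NCHW array that is copied to the device.
np_image = np.ascontiguousarray(t_image.numpy(), dtype=np.float32)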
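
The input array is cast to float32 here, while the device buffer is sized from an array of dtype `PRECISION`. If those two differ, the copy would not match the allocation. The following is a minimal sketch of a check that could be run before copying to the device. The helper name `check_host_input` is hypothetical, and the 800x800 shape is taken from the preprocessing step.

```python
import numpy as np


def check_host_input(arr, expected_shape, expected_dtype):
    """Return a list of mismatches between a host array and the buffer it is copied into."""
    problems = []
    if tuple(arr.shape) != tuple(expected_shape):
        problems.append(f"shape {tuple(arr.shape)} != {tuple(expected_shape)}")
    if arr.dtype != np.dtype(expected_dtype):
        problems.append(f"dtype {arr.dtype} != {np.dtype(expected_dtype)}")
    if not arr.flags["C_CONTIGUOUS"]:
        problems.append("array is not C-contiguous")
    return problems


example = np.zeros((1, 3, 800, 800), dtype=np.float32)
# An empty list means the array matches the expected shape, dtype, and layout.
print(check_host_input(example, (1, 3, 800, 800), np.float32))
# A float16 buffer would be reported as a dtype mismatch.
print(check_host_input(example, (1, 3, 800, 800), np.float16))
```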

inference:

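cfx.push()  # make the CUDA context current for this thread
boxes = output_boxes_dimension    # host array that receives the box output
logits = output_logits_dimension  # host array that receives the logits output
# Copy the input to the device, run the engine, and copy both outputs back.
cuda.memcpy_htod_async(cuda_inputs, np_image, stream)
context.execute_async_v2(bindings, stream.handle, None)
cuda.memcpy_dtoh_async(boxes, cuda_outputs_boxes, stream)
cuda.memcpy_dtoh_async(logits, cuda_outputs_logits, stream)
stream.synchronize()  # wait for the copies and the inference to finish
cfx.pop()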

output:

print(boxes.shape)
print(logits.shape)
boxes

(1, 100, 4)
(1, 100, 12)
array([[[-15.440976  ,   1.9481723 ,  -2.0340009 ,   5.3164234 ],
        [ -0.8660751 ,  -3.8535829 ,  -2.3773873 ,  -3.1096535 ],
        [  2.653059  ,  -4.726863  ,  -1.538347  ,  -2.5069196 ],
...
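
In the reference DETR implementation the box head ends in a sigmoid, so the boxes are (cx, cy, w, h) values normalized to [0, 1], while the class logits are unbounded. Values such as -15.44 therefore look more like logits than boxes. The sketch below is one way to test that and to post-process the outputs once they are correct. The helper names `looks_like_normalized_boxes` and `postprocess` are hypothetical, and it assumes the last class index is the "no object" class, as in the reference implementation.

```python
import numpy as np


def looks_like_normalized_boxes(arr):
    """True if the last axis has 4 entries and every value lies in [0, 1]."""
    arr = np.asarray(arr, dtype=np.float32)
    return arr.shape[-1] == 4 and bool(np.all((arr >= 0.0) & (arr <= 1.0)))


def postprocess(logits, boxes, img_w, img_h):
    """Convert raw outputs to per-query scores, labels, and pixel xyxy boxes."""
    logits = np.asarray(logits, dtype=np.float32)
    boxes = np.asarray(boxes, dtype=np.float32)
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    # Drop the trailing "no object" class before picking the best class.
    scores = probs[..., :-1].max(axis=-1)
    labels = probs[..., :-1].argmax(axis=-1)
    # (cx, cy, w, h) in [0, 1] -> (x0, y0, x1, y1) in pixels.
    cx, cy, w, h = np.moveaxis(boxes, -1, 0)
    xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)
    xyxy = xyxy * np.array([img_w, img_h, img_w, img_h], dtype=np.float32)
    return scores, labels, xyxy


# First row of the output shown in the post: clearly outside [0, 1].
suspect = np.array([[[-15.440976, 1.9481723, -2.0340009, 5.3164234]]])
print(looks_like_normalized_boxes(suspect))
```

If the check fails for the array read from the boxes buffer, comparing the engine output against ONNX Runtime or PyTorch on the same input would help narrow down whether the issue is in the conversion or in the buffer handling.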

AakankshaS · March 8, 2023, 9:23am

Hi @ozan.anli,
Could you please share your environment details, along with the ONNX model and a reproducible script?

Thanks

