Inferring a YOLO_v3 .trt model in Python

Yes, I tried this. I am getting negative values in the bbox coordinates with high confidence scores.

[array([442.91855, 561.6644 , 995.78345, 944.9944 ], dtype=float32), array([ 431.9024 , -49.264008, 1008.4338 , 469.98044 ], dtype=float32),

What did you modify in your original code? Can you share the latest one?

As per your suggestion, I changed the preprocessing code only.
trt_loader_yolonew.py (11.2 KB)

Can you mention what has been changed in def process_image?
I did not see any change.

@Morganh
Actually, my model input size is 1472x960. So if I resize the image without changing the aspect ratio, the resized image size is 1472x828. How can I then feed this image to inference?

You can consider it as padding. Please refer to the steps in Discrepancy between results from tlt-infer and trt engine - #8 by Morganh again.
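A minimal sketch of that resize-and-pad step (assuming OpenCV, a 1472x960 model input, and a placeholder image path) could look like:

import cv2
import numpy as np

# Hypothetical resize-and-pad step: keep the aspect ratio, then pad the
# remaining rows/columns with zeros so the tensor matches the 1472x960
# model input.
model_w, model_h = 1472, 960
img = cv2.imread("sample.jpg")  # placeholder path
scale = min(model_w / img.shape[1], model_h / img.shape[0])
new_w = int(round(img.shape[1] * scale))
new_h = int(round(img.shape[0] * scale))
resized = cv2.resize(img, (new_w, new_h))
padded = np.zeros((model_h, model_w, 3), dtype=np.uint8)
padded[:new_h, :new_w, :] = resized  # paste at the top-left, pad the rest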


@Morganh Are these steps correct?

image = cv2.imread(imname)
# cv2.imread takes an IMREAD_* flag, not a color-conversion code,
# so the BGR-to-RGB conversion is done separately.
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

image_resized = imutils.resize(image, width=self.model_w)
new_image = np.zeros((self.model_h, self.model_w, image_resized.shape[2]), np.uint8)
new_image[0:image_resized.shape[0], 0:image_resized.shape[1], :] = image_resized
img_np = new_image.astype(np.float32)

# HWC → CHW
img_np = img_np.transpose((2, 0, 1))
img_np = preprocess_input(img_np)
img_np = img_np.ravel()

Can you review your code by yourself? My suggestion is already mentioned above.


Yes, Morganh. I double-checked my code and changed the preprocessing as you suggested, but I am still getting negative values in the bbox coordinates. Can you please tell me why the bbox coordinates come out negative?
Here I have attached the preprocessing steps:

def _preprocess_yolo(self, img, letter_box=True):
    """Preprocess an image before TRT YOLO inferencing.

    # Args
        img: uint8 numpy array of shape (img_h, img_w, 3)
        letter_box: boolean, specifies whether to keep aspect ratio and
                    create a "letterboxed" image for inference

    # Returns
        preprocessed img: flattened float32 numpy array of shape (3*H*W,)
    """
    input_shape = (self.model_h, self.model_w)
    if letter_box:
        img_h, img_w, _ = img.shape
        new_h, new_w = input_shape[0], input_shape[1]
        offset_h, offset_w = 0, 0
        if (new_w / img_w) <= (new_h / img_h):
            new_h = int(img_h * new_w / img_w)
            offset_h = (input_shape[0] - new_h) // 2
        else:
            new_w = int(img_w * new_h / img_h)
            offset_w = (input_shape[1] - new_w) // 2
        resized = cv2.resize(img, (new_w, new_h))
        img = np.full((input_shape[0], input_shape[1], 3), 127, dtype=np.uint8)
        img[offset_h:(offset_h + new_h), offset_w:(offset_w + new_w), :] = resized
    else:
        img = cv2.resize(img, (input_shape[1], input_shape[0]))

    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))  # HWC → CHW
    img = preprocess_input(img)
    return img.ravel()
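The flattened array is then copied into the TRT input buffer. A sketch of that step (the inputs[0].host buffer name is an assumption following the TensorRT Python samples, not necessarily the attached script):

# Hypothetical usage: copy the flattened, preprocessed image into the
# first input binding's pagelocked host buffer before calling do_inference.
trt_input = self._preprocess_yolo(cv2.imread(imname))
np.copyto(inputs[0].host, trt_input)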

In your code, the ratio can be min(self.model_w/float(img_w), self.model_h/float(img_h)).
Then,

new_w = int(round(img_w * ratio))
new_h = int(round(img_h * ratio))

Please try to resize the image via PIL instead of cv2.

im = img.resize((new_w, new_h), Image.ANTIALIAS)

inf_img = Image.new('RGB', (self.model_w, self.model_h))
inf_img.paste(im, (0, 0))

Thank you, Morganh.
Yes, I did the same, but no luck; I am still getting negative values.

img_w = arr.size[0]
img_h = arr.size[1]

ratio = min(self.model_w/float(img_w), self.model_h/float(img_h))

new_w = int(round(img_w * ratio))
new_h = int(round(img_h * ratio))

im = arr.resize((new_w, new_h), Image.ANTIALIAS)

inf_img = Image.new('RGB', (self.model_w, self.model_h))
inf_img.paste(im, (0, 0))
inf_img = np.array(inf_img).astype(np.float32)
inference_input = preprocess_input(inf_img.transpose(2, 0, 1))
inference_input = inference_input.ravel()

Can you try inference_input = inf_img.transpose(2, 0, 1) instead?

Yes, I tried. The results are different, but I am still getting negative values.

For the previous one, with preprocess_input, I got these results:
[[ 513 -144 968 960]
[ 372 1157 786 1982]
[ 376 309 839 1234]
[ 424 1017 897 2002]
[ -55 831 454 2059]
[ 897 338 1237 1644]
[ 338 -174 875 983]
[ 807 854 1284 2054]
[ 822 -180 1273 990]]

And now, without preprocess_input, I got these:
[[ 475 1154 897 1989]
[ 355 298 800 1236]
[ 283 -141 719 965]
[ 339 1007 826 2008]
[ 9 227 351 1480]
[ 389 -168 931 982]
[ -10 -159 371 996]
[ 882 196 1243 1492]
[ 813 823 1280 2071]]

Negative coordinates appear in both.

What is the meaning of the above result? What did you print?

Bounding box coordinates [[x1, y1, x2, y2], […], …] of the detections.

Please modify your code to:

x_scale = float(img_shape[1]) / float(model_w)
y_scale = float(img_shape[0]) / float(model_h)
max_scale = max(x_scale, y_scale)

for i in range(p_keep_count[0]):
    assert(p_classes[i] < len(analysis_classes))
    if p_scores[i] > threshold:
        x1 = int(np.round(p_bboxes[i][0] * max_scale))
        y1 = int(np.round(p_bboxes[i][1] * max_scale))
        x2 = int(np.round(p_bboxes[i][2] * max_scale))
        y2 = int(np.round(p_bboxes[i][3] * max_scale))
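        # (Illustrative addition, not from the original suggestion: because
        # the resized image is pasted at (0, 0), the single max_scale factor
        # maps network-space coordinates back to the original frame, but the
        # NMS output can still contain boxes extending past the borders;
        # clipping them keeps the pixel coordinates non-negative.)
        x1 = max(0, min(x1, img_shape[1] - 1))
        y1 = max(0, min(y1, img_shape[0] - 1))
        x2 = max(0, min(x2, img_shape[1] - 1))
        y2 = max(0, min(y2, img_shape[0] - 1))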

I ran a standalone script against one KITTI image (004987.png).
It can detect the bboxes below:
[625 176 646 191]
[784 179 841 211]
[438 173 471 185]


Yes, that is a change in the postprocessing, but I got these results before postprocessing.

I am getting the above results from here:

detection_out = self.do_inference(
    self.context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream
)
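Here detection_out is the list of flattened output buffers returned by do_inference. A sketch of unpacking it into the four NMS outputs referenced above (the ordering is an assumption and should be checked against the engine's actual binding names):

import numpy as np

# Hypothetical unpacking of the four NMS output buffers; the order below
# (keep_count, bboxes, scores, classes) is an assumption to be verified
# against the engine's bindings.
p_keep_count, p_bboxes, p_scores, p_classes = detection_out
p_bboxes = np.array(p_bboxes).reshape(-1, 4)  # one [x1, y1, x2, y2] per kept box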

Please try the same image as mine, KITTI image (004987.png).
I can run inference against it without issue.

OK. Could you please check my model in your script? Shall I attach my model and input image here?