Yeah, I tried this. I am getting negative values in the bbox coordinates with high confidence scores.
[array([442.91855, 561.6644 , 995.78345, 944.9944 ], dtype=float32), array([ 431.9024 , -49.264008, 1008.4338 , 469.98044 ], dtype=float32),
What did you modify in your original code? Can you share the latest one?
Can you mention what has been changed in def process_image? I did not see any change.
@Morganh
Actually, my model input size is 1472x960. If I resize the image without changing the aspect ratio, the resized image size is 1472x828. How can I then feed this image to inference?
You can consider it as padding. Please refer to the steps in Discrepancy between results from tlt-infer and trt engine - #8 by Morganh again.
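The resize-then-pad step can be sketched as follows. This is a minimal numpy sketch, not the code from the linked post: the function names and the 1920x1080 example frame are illustrative, and the actual resize would use cv2 or PIL as discussed below.

```python
import numpy as np

def fit_and_pad_shape(img_w, img_h, model_w=1472, model_h=960):
    """Compute the aspect-ratio-preserving target size for a model input."""
    ratio = min(model_w / float(img_w), model_h / float(img_h))
    return int(round(img_w * ratio)), int(round(img_h * ratio)), ratio

def pad_to_model(resized, model_w=1472, model_h=960):
    """Paste the resized image into the top-left of a zero-filled canvas."""
    h, w = resized.shape[:2]
    canvas = np.zeros((model_h, model_w, 3), dtype=resized.dtype)
    canvas[:h, :w, :] = resized
    return canvas

# A 1920x1080 frame fits a 1472x960 input as 1472x828, leaving
# 132 rows of zero padding at the bottom of the canvas.
new_w, new_h, ratio = fit_and_pad_shape(1920, 1080)
padded = pad_to_model(np.ones((new_h, new_w, 3), dtype=np.uint8))
```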
@Morganh Are these steps correct?
image = cv2.cvtColor(cv2.imread(imname), cv2.COLOR_BGR2RGB)  # imread's second argument is a read flag, not a conversion code
image_resized=imutils.resize(image,width=self.model_w)
new_image = np.zeros((self.model_h,self.model_w,image_resized.shape[2]), np.uint8)
new_image[0:image_resized.shape[0],0:image_resized.shape[1],:]=image_resized
img_np = new_image.astype(np.float32)
img_np = img_np.transpose((2, 0, 1))
img_np = preprocess_input(img_np)
img_np = img_np.ravel()
Can you review your code by yourself? My suggestion is already mentioned above.
Yes, Morganh. I double-checked my code and changed the preprocessing as you suggested, but I am still getting negative values in the bbox coordinates. Can you please tell me why the bbox coordinates come out negative?
Here are the preprocessing steps:
def _preprocess_yolo(self, img, letter_box=True):
    """Preprocess an image before TRT YOLO inferencing.

    # Args
        img: int8 numpy array of shape (img_h, img_w, 3)
        input_shape: a tuple of (H, W)
        letter_box: boolean, specifies whether to keep aspect ratio and
            create a "letterboxed" image for inference

    # Returns
        preprocessed img: float32 numpy array of shape (3, H, W)
    """
    input_shape = (self.model_h, self.model_w)
    if letter_box:
        img_h, img_w, _ = img.shape
        new_h, new_w = input_shape[0], input_shape[1]
        offset_h, offset_w = 0, 0
        if (new_w / img_w) <= (new_h / img_h):
            new_h = int(img_h * new_w / img_w)
            offset_h = (input_shape[0] - new_h) // 2
        else:
            new_w = int(img_w * new_h / img_h)
            offset_w = (input_shape[1] - new_w) // 2
        resized = cv2.resize(img, (new_w, new_h))
        img = np.full((input_shape[0], input_shape[1], 3), 127, dtype=np.uint8)
        img[offset_h:(offset_h + new_h), offset_w:(offset_w + new_w), :] = resized
    else:
        img = cv2.resize(img, (input_shape[1], input_shape[0]))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))
    img = preprocess_input(img)
    return img.ravel()
In your code, the ratio can be min(self.model_w/float(img_w), self.model_h/float(img_h))
Then,
new_w = int(round(img_w * ratio))
new_h = int(round(img_h * ratio))
Please try to resize the image via PIL instead of cv2.
im = img.resize((new_w, new_h), Image.ANTIALIAS)
inf_img = Image.new('RGB', (self.model_w, self.model_h))
inf_img.paste(im, (0, 0))
Thank you Morganh
Yeah, I did the same, but no luck; I am still getting negative values.
img_w = arr.size[0]
img_h = arr.size[1]
ratio = min(self.model_w / float(img_w), self.model_h / float(img_h))
new_w = int(round(img_w * ratio))
new_h = int(round(img_h * ratio))
im = arr.resize((new_w, new_h), Image.ANTIALIAS)
inf_img = Image.new('RGB', (self.model_w, self.model_h))
inf_img.paste(im, (0, 0))
inf_img = np.array(inf_img).astype(np.float32)
inference_input = preprocess_input(inf_img.transpose(2, 0, 1))
inference_input = inference_input.ravel()
Can you try inference_input = inf_img.transpose(2, 0, 1) instead?
Yeah, tried it. The results are different but I am still getting negative values.
For the previous one, with preprocess_input, I got these results:
[[ 513 -144 968 960]
[ 372 1157 786 1982]
[ 376 309 839 1234]
[ 424 1017 897 2002]
[ -55 831 454 2059]
[ 897 338 1237 1644]
[ 338 -174 875 983]
[ 807 854 1284 2054]
[ 822 -180 1273 990]]
And now, without preprocess_input, I got this:
[[ 475 1154 897 1989]
[ 355 298 800 1236]
[ 283 -141 719 965]
[ 339 1007 826 2008]
[ 9 227 351 1480]
[ 389 -168 931 982]
[ -10 -159 371 996]
[ 882 196 1243 1492]
[ 813 823 1280 2071]]
Negative values are present in both.
What is the meaning of above result? What did you print?
Bounding box coordinates [ [x1,y1,x2,y2], […] … ] of the detections.
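(As an aside: raw detector outputs can legitimately extend past the image border. A common cleanup step, shown here as a minimal numpy sketch rather than anything from this thread, is to clip the boxes to the frame. The 1280x2048 frame size is a hypothetical example.)

```python
import numpy as np

def clip_boxes(boxes, img_w, img_h):
    """Clip [x1, y1, x2, y2] boxes to the image frame."""
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, img_w - 1)  # x coords
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, img_h - 1)  # y coords
    return boxes

# A negative y1 is clamped to 0; in-frame values are unchanged.
clipped = clip_boxes([[513, -144, 968, 960]], img_w=1280, img_h=2048)
```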
Please modify your code to:
x_scale = float(img_shape[1]) / float(model_w)
y_scale = float(img_shape[0]) / float(model_h)
max_scale = max(x_scale, y_scale)
for i in range(p_keep_count[0]):
    assert(p_classes[i] < len(analysis_classes))
    if p_scores[i] > threshold:
        x1 = int(np.round(p_bboxes[i][0] * max_scale))
        y1 = int(np.round(p_bboxes[i][1] * max_scale))
        x2 = int(np.round(p_bboxes[i][2] * max_scale))
        y2 = int(np.round(p_bboxes[i][3] * max_scale))
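To see why this max_scale maps network-space coordinates back to the original image, here is a small self-contained check: max_scale is exactly the inverse of the letterbox ratio used during preprocessing. The 1920x1080 source size is a hypothetical example, not taken from the thread.

```python
import numpy as np

model_w, model_h = 1472, 960
img_w, img_h = 1920, 1080  # hypothetical source frame

# Forward: letterbox ratio used during preprocessing
ratio = min(model_w / float(img_w), model_h / float(img_h))

# Backward: the scale suggested for postprocessing
x_scale = float(img_w) / float(model_w)
y_scale = float(img_h) / float(model_h)
max_scale = max(x_scale, y_scale)

# max_scale == 1/ratio, so multiplying network-space coordinates
# by it maps boxes back to original-image pixels.
assert abs(max_scale - 1.0 / ratio) < 1e-9

box_net = np.array([442.9, 561.7, 995.8, 944.99])
box_img = np.round(box_net * max_scale).astype(int)
```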
I ran standalone code against one KITTI image (004987.png).
It can detect bboxes as below.
[625 176 646 191]
[784 179 841 211]
[438 173 471 185]
Yeah, that is a change in postprocessing, but I got these results before postprocessing.
I am getting the above results from here:
detection_out = self.do_inference(
    self.context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream
)
Please try the same image as mine , KITTI image (004987.png).
I can run inference well against it.
OK. Could you please check my model in your script? Shall I attach my model and input image here?