Running inference with a Yolo_v3.trt model in Python

@Morganh Yes, inside DeepStream the etlt model is automatically converted to a TRT engine, and that model works perfectly. But I need to run inference from a Python script, so I converted the etlt model to a TRT engine on my system with tlt-converter and used the converted engine in my script. I still get false positives.

Can you run tlt-infer? tlt-infer is the default way to run inference.
Actually, TLT only supports two ways: one is tlt-infer, the other is DeepStream.

Yes, I can run tlt-infer, and it also works in DeepStream. Is there any way to run inference with a Python script?

Sure, that is possible. Please debug your code.
I will also check it later.

OK, sure. Thank you for your support.

For preprocessing, please follow Discrepancy between results from tlt-infer and trt engine - #8 by Morganh

@Morganh I did the same, but I am still getting false positives. I don't know what mistake I made.
def process_image(self,arr):

    image = Image.fromarray(np.uint8(arr))

    image_resized = image.resize(size=(self.model_w, self.model_h))
    img_np = np.array(image_resized, dtype=np.float64)  # np.float was removed in NumPy 1.24
    # HWC -> CHW
    img_np = img_np.transpose((2, 0, 1))
    img_np = img_np.ravel()

No, it is not the same. Your code changes the aspect ratio. Please do not change the aspect ratio.


@Morganh
Please help me. I didn't change the aspect ratio, but I am still getting false positives.

class PreprocessYOLO(object):
    """A simple class for loading images with PIL and reshaping them to the specified
    input resolution for YOLOv3-608.
    """

    def __init__(self, yolo_input_resolution):
        """Initialize with the input resolution for YOLOv3, which will stay fixed in this sample.
        Keyword arguments:
        yolo_input_resolution -- two-dimensional tuple with the target network's (spatial)
        input resolution in HW order
        """
        self.yolo_input_resolution = yolo_input_resolution

    def process(self, input_image_path):
        """Load an image from the specified input path,
        and return it together with a pre-processed version required for feeding it into a
        YOLOv3 network.
        Keyword arguments:
        input_image_path -- string path of the image to be loaded
        """
        image_raw, image_resized = self._load_and_resize(input_image_path)
        image_preprocessed = self._shuffle_and_normalize(image_resized)
        return image_raw, image_preprocessed

    def _load_and_resize(self, input_image_path):
        """Load an image from the specified path and resize it to the input resolution.
        Return the input image before resizing as a PIL Image (required for visualization),
        and the resized image as a NumPy float array.
        Keyword arguments:
        input_image_path -- string path of the image to be loaded
        """

        image_raw = Image.open(input_image_path)
        # Expecting yolo_input_resolution in (height, width) format, adjusting to PIL
        # convention (width, height) in PIL:
        new_resolution = (
            self.yolo_input_resolution[1],
            self.yolo_input_resolution[0])
        image_resized = image_raw.resize(
            new_resolution, resample=Image.BICUBIC)
        image_resized = np.array(image_resized, dtype=np.float32, order='C')
        return image_raw, image_resized

    def _shuffle_and_normalize(self, image):
        """Normalize a NumPy array representing an image to the range [0, 1], and
        convert it from HWC format ("channels last") to NCHW format ("channels first"
        with leading batch dimension).
        Keyword arguments:
        image -- image as three-dimensional NumPy float array, in HWC format
        """
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.array(image, dtype=np.float32, order='C')
        return image

You changed the aspect ratio in your original code below.

def process_image(self,arr):
    
    # image = Image.fromarray(np.uint8(arr))
    
    #image_resized = image.resize(size=(self.model_w, self.model_h), resample=Image.BILINEAR)
    image_resized=cv2.resize(arr,(self.model_w, self.model_h))
    img_np = image_resized.astype(np.float32)
    # HWC -> CHW
    img_np = img_np.transpose((2, 0, 1))
    print(img_np)
    # Normalize to [0.0, 1.0] interval (expected by model)
    # img_np = (1.0 / 255.0) * img_np
    img_np = img_np.ravel()
    return img_np

Please try to follow the steps I mentioned above.

Yes, I tried this. I am getting negative values in the bbox coordinates with high confidence scores.

[array([442.91855, 561.6644 , 995.78345, 944.9944 ], dtype=float32), array([ 431.9024 , -49.264008, 1008.4338 , 469.98044 ], dtype=float32),
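A detector can predict a box whose edge falls outside the image, so a negative coordinate on its own is often handled by clipping in post-processing. The sketch below shows that step only. It assumes the boxes are in (x1, y1, x2, y2) order and in model-input pixels (1472x960), neither of which is confirmed in this thread, and `clip_boxes` is a hypothetical helper name. Clipping does not fix false positives that come from incorrect preprocessing.

```python
import numpy as np


def clip_boxes(boxes, img_w, img_h):
    """Clip boxes to the image bounds.

    Assumes each box is (x1, y1, x2, y2) in pixels of an img_w x img_h image.
    """
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    # x coordinates are limited to [0, img_w], y coordinates to [0, img_h].
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, img_w)
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, img_h)
    return boxes


# The two boxes posted above, clipped to an assumed 1472x960 input.
boxes = [[442.91855, 561.6644, 995.78345, 944.9944],
         [431.9024, -49.264008, 1008.4338, 469.98044]]
clipped = clip_boxes(boxes, 1472, 960)
print(clipped)
```

Only the negative y1 of the second box changes here; every other value is already inside the assumed bounds.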

What did you modify in your original code? Can you share the latest one?

As per your suggestion, I changed only the preprocessing code.
trt_loader_yolonew.py (11.2 KB)

Can you mention what has been changed in def process_image?
I did not see any change.

@Morganh
Actually, my model input size is 1472x960. If I resize the image without changing the aspect ratio, the resized image is 1472x828. How can I feed this image to the model for inference?

You can fill the remaining area with padding. Please refer to the steps in Discrepancy between results from tlt-infer and trt engine - #8 by Morganh again.
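To make the padding arithmetic concrete, here is a minimal sketch. It assumes a 1920x1080 source frame, which is not stated in this thread but is consistent with the 1472x828 figure above, and `letterbox_size` is a hypothetical helper name.

```python
def letterbox_size(img_w, img_h, model_w, model_h):
    """Return the resized size and the padding needed to fill the model input."""
    # Use the smaller scale factor so the whole image fits inside the input.
    ratio = min(model_w / float(img_w), model_h / float(img_h))
    new_w = int(round(img_w * ratio))
    new_h = int(round(img_h * ratio))
    # Whatever is left over in each dimension has to be padded.
    pad_w = model_w - new_w
    pad_h = model_h - new_h
    return new_w, new_h, pad_w, pad_h


# Assumed 1920x1080 frame, 1472x960 model input.
new_w, new_h, pad_w, pad_h = letterbox_size(1920, 1080, 1472, 960)
print(new_w, new_h, pad_w, pad_h)  # 1472 828 0 132
```

With these numbers the resized image fills the full width and leaves 132 rows to pad, so the network still receives a 1472x960 tensor.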


@Morganh Are these steps correct?
image = cv2.imread(imname,cv2.COLOR_BGR2RGB)

image_resized=imutils.resize(image,width=self.model_w)
new_image = np.zeros((self.model_h,self.model_w,image_resized.shape[2]), np.uint8)

new_image[0:image_resized.shape[0],0:image_resized.shape[1],:]=image_resized
img_np = new_image.astype(np.float32)

# HWC -> CHW

img_np = img_np.transpose((2, 0, 1))
img_np = preprocess_input(img_np)
img_np = img_np.ravel()

Can you review your code by yourself? My suggestion is already mentioned above.


Yes, Morganh. I double-checked my code and changed the preprocessing as you suggested, but I am still getting negative values in the bbox coordinates. Can you please tell me why the bbox coordinates come out negative?
I have attached the preprocessing steps here.

def _preprocess_yolo(self, img, letter_box=True):
    """Preprocess an image before TRT YOLO inferencing.

    # Args
        img: int8 numpy array of shape (img_h, img_w, 3)
        input_shape: a tuple of (H, W)
        letter_box: boolean, specifies whether to keep aspect ratio and
                    create a "letterboxed" image for inference

    # Returns
        preprocessed img: float32 numpy array of shape (3, H, W)
    """
    input_shape = (self.model_h, self.model_w)
    if letter_box:
        img_h, img_w, _ = img.shape
        new_h, new_w = input_shape[0], input_shape[1]
        offset_h, offset_w = 0, 0
        if (new_w / img_w) <= (new_h / img_h):
            new_h = int(img_h * new_w / img_w)
            offset_h = (input_shape[0] - new_h) // 2
        else:
            new_w = int(img_w * new_h / img_h)
            offset_w = (input_shape[1] - new_w) // 2
        resized = cv2.resize(img, (new_w, new_h))
        img = np.full((input_shape[0], input_shape[1], 3), 127, dtype=np.uint8)
        img[offset_h:(offset_h + new_h), offset_w:(offset_w + new_w), :] = resized
    else:
        img = cv2.resize(img, (input_shape[1], input_shape[0]))

    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))
    img = preprocess_input(img)
    return img.ravel()

In your code, the ratio can be computed as min(self.model_w/float(img_w), self.model_h/float(img_h)).
Then,

new_w = int(round(img_w * ratio))
new_h = int(round(img_h*ratio))

Please try resizing the image with PIL instead of OpenCV.

im = img.resize((new_w, new_h), Image.ANTIALIAS)

inf_img = Image.new('RGB', (self.model_w, self.model_h))
inf_img.paste(im, (0, 0))
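
Put together, the steps above could look like the following sketch. It is not the exact tlt-infer implementation. `letterbox_pil` is a hypothetical helper name, the 1920x1080 frame is a synthetic stand-in for a real image, and `Image.LANCZOS` is used because `Image.ANTIALIAS` was an alias for the same filter and was removed in Pillow 10. Any channel reordering or normalization the model needs is left to the caller.

```python
import numpy as np
from PIL import Image


def letterbox_pil(img, model_w, model_h):
    """Resize a PIL image without changing its aspect ratio, then pad it.

    Returns a float32 array in CHW order with no normalization applied.
    """
    img_w, img_h = img.size
    ratio = min(model_w / float(img_w), model_h / float(img_h))
    new_w = int(round(img_w * ratio))
    new_h = int(round(img_h * ratio))

    # Same filter as the older Image.ANTIALIAS name.
    resized = img.resize((new_w, new_h), Image.LANCZOS)

    # Black canvas of the model input size, image pasted at the top-left.
    canvas = Image.new('RGB', (model_w, model_h))
    canvas.paste(resized, (0, 0))

    arr = np.asarray(canvas, dtype=np.float32)  # HWC
    return arr.transpose((2, 0, 1))  # CHW


# Synthetic white frame standing in for a real image (assumed 1920x1080).
frame = Image.new('RGB', (1920, 1080), (255, 255, 255))
chw = letterbox_pil(frame, 1472, 960)
print(chw.shape)  # (3, 960, 1472)
```

Because the image is pasted at (0, 0), the padding sits on the right and bottom only, so box coordinates from the network can be divided by the same ratio to map them back to the original image without subtracting an offset.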