Given an image of size 600 x 800 (height x width), the following configs:
Streammux:
  enable-padding=1
PGIE:
  symmetric-padding=1
  maintain-aspect-ratio=1
  scaling-filter=0
and a network input shape of 608 x 1088 (height x width) will result in the following image (ignoring the color channels for now).
This basically resizes the input image, pads it, and puts it in the middle of a black image of size 608 x 1088. From my understanding, this DeepStream config is similar to the following Python code?
import cv2

def letterbox(img, height=608, width=1088, color=(0, 0, 0)):
    # Resize a rectangular image, keeping its aspect ratio, and pad it to height x width
    shape = img.shape[:2]  # shape = [height, width]
    ratio = min(float(height) / shape[0], float(width) / shape[1])
    new_shape = (round(shape[1] * ratio), round(shape[0] * ratio))  # new_shape = [width, height]
    dw = (width - new_shape[0]) / 2  # width padding per side
    dh = (height - new_shape[1]) / 2  # height padding per side
    # The +/- 0.1 sends an odd leftover pixel to the bottom/right side
    top, bottom = round(dh - 0.1), round(dh + 0.1)
    left, right = round(dw - 0.1), round(dw + 0.1)
    img = cv2.resize(img, new_shape, interpolation=cv2.INTER_AREA)  # resized, no border
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # padded
    return img, ratio, dw, dh
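As a quick sanity check of the numbers involved, here is a minimal pure-Python sketch of the same geometry, without OpenCV. The helper name `letterbox_geometry` is made up for illustration. It only reproduces the arithmetic of the `letterbox` function above and says nothing about what DeepStream does internally.

```python
def letterbox_geometry(img_h, img_w, net_h=608, net_w=1088):
    """Return (ratio, new_w, new_h, left, right, top, bottom) for centered letterboxing.

    Hypothetical helper: same arithmetic as letterbox() above, no image data needed.
    """
    ratio = min(net_h / img_h, net_w / img_w)
    new_w, new_h = round(img_w * ratio), round(img_h * ratio)
    dw = (net_w - new_w) / 2  # width padding per side (may be fractional)
    dh = (net_h - new_h) / 2  # height padding per side (may be fractional)
    top, bottom = round(dh - 0.1), round(dh + 0.1)
    left, right = round(dw - 0.1), round(dw + 0.1)
    return ratio, new_w, new_h, left, right, top, bottom


ratio, new_w, new_h, left, right, top, bottom = letterbox_geometry(600, 800)
print(new_w, new_h)              # 811 608
print(left, right, top, bottom)  # 138 139 0 0
```

For the 600 x 800 input, the height is the limiting dimension (ratio 608/600), so all of the padding goes to the left and right. Note that the Python version rounds the resized width to 811 pixels, while an unrounded computation gives 810.67 pixels and 138.67 pixels of padding per side, so the two approaches can differ by a fraction of a pixel.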
Here are my questions:
- Does the Python implementation above match the DeepStream preprocessing? I need to know this because I have to rescale the predictions back to the original input image of size 600 x 800 with
// requires #include <algorithm> for std::min
// rect is the detected box in network-input coordinates (x, y, width, height)
float net_width = 1088.f, net_height = 608.f;
float img_width = 800.f, img_height = 600.f;
float gain = std::min(net_width / img_width, net_height / img_height);
float pad_x = (net_width - img_width * gain) / 2;   // horizontal padding per side
float pad_y = (net_height - img_height * gain) / 2; // vertical padding per side
float x1 = (rect.x - pad_x) / gain;
float y1 = (rect.y - pad_y) / gain;
float x2 = (rect.x + rect.width - pad_x) / gain;
float y2 = (rect.y + rect.height - pad_y) / gain;
float width = x2 - x1;
float height = y2 - y1;
If it's not the same, can you point me to an example of rescaling the predictions back to their original scale?
- In DeepStream, with the currently supported Streammux and PGIE configs, is it possible to pad the resized image so that it sits in the top-left corner rather than in the middle, as below?
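To make the two layouts concrete, here is a minimal Python sketch of the inverse mapping for both cases. It mirrors the C++ snippet above; the function name `scale_box_to_image` and the `centered` flag are made up for illustration. Whether DeepStream's preprocessing matches either layout exactly is the open question here, so treat this as a sketch under that assumption rather than a confirmed implementation.

```python
def scale_box_to_image(box, img_hw, net_hw=(608, 1088), centered=True):
    """Map a box (x, y, w, h) from network-input pixels to original-image pixels.

    centered=True  assumes the resized image sits in the middle of the canvas.
    centered=False assumes it sits in the top-left corner (padding on the
    right and bottom only), so no offset has to be subtracted.
    """
    img_h, img_w = img_hw
    net_h, net_w = net_hw
    gain = min(net_w / img_w, net_h / img_h)
    pad_x = (net_w - img_w * gain) / 2 if centered else 0.0
    pad_y = (net_h - img_h * gain) / 2 if centered else 0.0
    x, y, w, h = box
    x1 = (x - pad_x) / gain
    y1 = (y - pad_y) / gain
    x2 = (x + w - pad_x) / gain
    y2 = (y + h - pad_y) / gain
    return x1, y1, x2 - x1, y2 - y1


# A box covering the whole resized region should map back to the full 600 x 800 image.
gain = 608 / 600
pad_x = (1088 - 800 * gain) / 2
print(scale_box_to_image((pad_x, 0.0, 800 * gain, 608.0), (600, 800)))
print(scale_box_to_image((0.0, 0.0, 800 * gain, 608.0), (600, 800), centered=False))
```

With top-left placement the padding offsets drop out entirely, so the inverse mapping reduces to dividing by the gain.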
Environment
Architecture: x86_64
GPU: NVIDIA GeForce GTX 1650 Ti with Max-Q Design
NVIDIA GPU Driver Version: 495.29.05
DeepStream Version: 6.0 (running on docker image nvcr.io/nvidia/deepstream:6.0-devel)
TensorRT Version: v8001
Issue Type: Question
Thanks
Peeranat F.