I would like to clarify the image size options and how to specify them, as well as the shapes of the model's input and output tensors once it is exported to TensorRT, so that I can feed images to the model and receive inference masks.
I am also interested in knowing whether the tensors are NHWC or NCHW.
That means you need to use the TRT OSS repo to build a new /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so and replace the existing one.
Change size_ht and size_wd under ‘Pad’ and the two values under crop_size. The first parameter under crop_size is the height. For example, to train at 720x1280 (height x width) resolution, set size_ht and the first crop_size value to 720, and set size_wd and the second crop_size value to 1280.
For img_scale, please set the second parameter to the shortest input resolution. The first parameter can be any value greater than your shortest input resolution, up to 2048. It is advisable, however, to use equal height and width to avoid the hassle of setting values in multiple places.
In addition, img_scale should be >= crop_size.
We will improve the documentation in the next release.
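The guidance above can be turned into a small sanity check before training. The following is only a sketch: the helper name check_scales is hypothetical, the values are illustrative, and the way img_scale is compared with crop_size (larger side against larger side, smaller side against smaller side) is an assumption, since the thread does not spell out the exact comparison.

```python
def check_scales(img_scale, crop_size, shortest_input_side, max_first_param=2048):
    """Return a list of problems; an empty list means the values follow the guidance.

    img_scale is (first_param, second_param); crop_size is (height, width).
    """
    first, second = img_scale
    problems = []

    # The second img_scale parameter should be the shortest input resolution.
    if second != shortest_input_side:
        problems.append("second img_scale value should equal the shortest input side")

    # The first parameter may go up to 2048 and should not be below the shortest side.
    # Assumption: equality is tolerated here so that square settings pass.
    if first < shortest_input_side or first > max_first_param:
        problems.append("first img_scale value is outside the allowed range")

    # Assumption: "img_scale >= crop_size" is checked side by side after sorting.
    small_scale, large_scale = sorted(img_scale)
    small_crop, large_crop = sorted(crop_size)
    if small_scale < small_crop or large_scale < large_crop:
        problems.append("img_scale should be >= crop_size")

    return problems


# Illustrative values for training on 720x1280 (height x width) inputs.
print(check_scales(img_scale=(2048, 720), crop_size=(720, 1280), shortest_input_side=720))
```

An empty list means nothing conflicts with the rules as read here; it does not guarantee that the training config itself is valid.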
See Data Annotation Format
For color (RGB) input images, each mask image is a single-channel or three-channel image with the same size as the input image. Every pixel in the mask should have an integer value that represents the segmentation class label_id.
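A quick way to catch annotation problems early is to check each mask against these rules. This is a minimal sketch using NumPy arrays that are assumed to be already loaded in height x width (x channels) layout; the function name mask_label_ids is hypothetical and not part of any toolkit.

```python
import numpy as np


def mask_label_ids(image, mask):
    """Validate a mask against its image and return the sorted label ids it contains."""
    # The mask must have the same height and width as the input image.
    if mask.shape[:2] != image.shape[:2]:
        raise ValueError(f"mask size {mask.shape[:2]} != image size {image.shape[:2]}")

    # Only single-channel (H, W) or (H, W, 1), or three-channel (H, W, 3) masks are expected.
    if mask.ndim == 3 and mask.shape[2] not in (1, 3):
        raise ValueError(f"unexpected number of mask channels: {mask.shape[2]}")
    if mask.ndim not in (2, 3):
        raise ValueError(f"unexpected mask dimensions: {mask.ndim}")

    # Every pixel must hold an integer label id.
    if not np.issubdtype(mask.dtype, np.integer):
        raise ValueError(f"mask dtype {mask.dtype} is not an integer type")

    return np.unique(mask).tolist()


# Tiny synthetic example: a 4x6 RGB image and a single-channel mask with three classes.
image = np.zeros((4, 6, 3), dtype=np.uint8)
mask = np.array([[0, 0, 1, 1, 2, 2]] * 4, dtype=np.uint8)
print(mask_label_ids(image, mask))
```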
A ratio is randomly sampled from the range specified by ratio_range. It is then multiplied with img_scale to generate the sampled scale.
img_scale contains the base image scale that is multiplied by the ratio.
ratio_range contains the minimum and maximum ratios used to scale img_scale.
In the above example, the minimum ratio is 0.5 and the maximum ratio is 2.0.
The new height is then randomly set within the range 1024 x 0.5 to 1024 x 2.0, and the new width within the range 512 x 0.5 to 512 x 2.0. So the augmented images’ resolution is (new_height, new_width).
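The sampling described here can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the toolkit's implementation: the function name sample_scale is hypothetical, and truncating to an integer is an assumption about rounding.

```python
import random


def sample_scale(img_scale, ratio_range, rng=random):
    """Sample one ratio from ratio_range and apply it to both values of img_scale."""
    min_ratio, max_ratio = ratio_range
    ratio = rng.uniform(min_ratio, max_ratio)
    # Assumption: the scaled sides are truncated to integers.
    return tuple(int(side * ratio) for side in img_scale)


# With img_scale (1024, 512) and ratio_range (0.5, 2.0), the first value lands in
# [512, 2048] and the second in [256, 1024].
rng = random.Random(0)  # seeded only to make the illustration repeatable
for _ in range(3):
    print(sample_scale((1024, 512), (0.5, 2.0), rng))
```

Because a single ratio is applied to both values, the aspect ratio of img_scale is preserved in every sample.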
The validation config contains “multi_scale” for validation during training. multi_scale is the largest image scale.
Pad: The padding augmentation.
size_ht (int): The height to which the image/mask is padded.
size_wd (int): The width to which the image/mask is padded.
pad_val (int): The padding value for the input image.
seg_pad_val (int): The padding value for the segmentation mask.
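To make the four Pad parameters concrete, here is a minimal NumPy sketch of what such a padding step could look like. It is not the toolkit's code: the function name pad_to_size is hypothetical, padding on the bottom and right edges is an assumption, and the default values 0 and 255 are placeholders chosen for the illustration.

```python
import numpy as np


def pad_to_size(image, mask, size_ht, size_wd, pad_val=0, seg_pad_val=255):
    """Pad image and mask to (size_ht, size_wd) using separate fill values."""
    height, width = image.shape[:2]
    if height > size_ht or width > size_wd:
        raise ValueError("input is larger than the requested padded size")

    # Assumption: extra rows go to the bottom and extra columns to the right.
    extra = ((0, size_ht - height), (0, size_wd - width))
    image_pad = extra + ((0, 0),) * (image.ndim - 2)  # leave channels untouched
    mask_pad = extra + ((0, 0),) * (mask.ndim - 2)

    padded_image = np.pad(image, image_pad, mode="constant", constant_values=pad_val)
    padded_mask = np.pad(mask, mask_pad, mode="constant", constant_values=seg_pad_val)
    return padded_image, padded_mask


# A 2x3 RGB image and its mask, padded to 4x5.
image = np.ones((2, 3, 3), dtype=np.uint8)
mask = np.ones((2, 3), dtype=np.uint8)
padded_image, padded_mask = pad_to_size(image, mask, size_ht=4, size_wd=5)
print(padded_image.shape, padded_mask.shape)
```

Using a separate seg_pad_val lets the padded mask region carry a value that differs from any real class id, so it can be told apart from labeled pixels.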