Resnet18 Object Detection Image Resolution Problem

NitinRai · November 21, 2019, 11:05am

I first tried training a resnet18 model on the publicly available KITTI dataset to get an idea of how the whole TLT workflow works. I had to write small scripts to resize all the images to the same resolution and rescale their bounding boxes accordingly; otherwise the training threw errors after a few epochs.
I successfully trained the model on a DGX-1 and deployed it to an AGX Xavier.
The training went smoothly because all the images have similar resolutions, so it was easy to resize them to a uniform resolution in multiples of 16.
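
A minimal sketch of the kind of resize-and-rescale helper described above, not the actual scripts used. It assumes the standard KITTI label layout, where fields 4 to 7 of each line are the 2D box (left, top, right, bottom) in pixels. The function names `round_to_multiple` and `rescale_kitti_line` are hypothetical, and the image resize itself (done with any image library) is left out.

```python
def round_to_multiple(value, base=16):
    """Round a dimension to the nearest multiple of `base` (never below `base`)."""
    return max(base, int(round(value / base)) * base)


def rescale_kitti_line(line, src_size, dst_size):
    """Scale the 2D bounding box of one KITTI label line.

    src_size and dst_size are (width, height) tuples. Only fields 4..7
    (left, top, right, bottom) are changed; all other fields are kept.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx, sy = dst_w / src_w, dst_h / src_h

    fields = line.split()
    left, top, right, bottom = (float(v) for v in fields[4:8])
    fields[4:8] = [
        f"{left * sx:.2f}",
        f"{top * sy:.2f}",
        f"{right * sx:.2f}",
        f"{bottom * sy:.2f}",
    ]
    return " ".join(fields)


if __name__ == "__main__":
    # Pick a target resolution that is a multiple of 16 in both dimensions.
    target = (round_to_multiple(1242), round_to_multiple(375))
    label = "Car 0.00 0 0.00 100.00 50.00 200.00 150.00 0 0 0 0 0 0 0"
    print(target)
    print(rescale_kitti_line(label, (1242, 375), target))
```

Note that a plain resize like this changes the aspect ratio whenever the source and target ratios differ, which is acceptable only when the images are already similar in shape.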

Now comes the real part: I have to train a model on a custom dataset (all the required labels are already prepared in KITTI format).
I also now have good experience building all the spec files required to train a model successfully.

But here’s the issue: the image resolutions differ widely, and so do their aspect ratios.
I can’t simply resize all the images to the same resolution, as that would destroy the sensitive parts of the images.

I am sharing the statistics I have computed on their resolutions so far:

stats    size(kb)     aspect_ratio    height     width
mean     89.12        1.205335        323.0      224.0
std      85.11        0.604291        187.0      226.0
min      3.80         0.000000        49.0       27.0
25%      28.42        1.000000        175.0      99.0
50%      61.94        1.000000        272.0      154.0
75%      118.75       2.000000        427.0      247.0
max      823.51       6.000000        1122.0     1920.0

Histogram: height vs. width
https://66.media.tumblr.com/d07be29c4bf5ff6778aa6ea2ccf59a18/299825b9a9864a17-f1/s2048x3072/0832a6ef06cfc8b125e218b8fabbc39b6ad2328f.png

Can anyone please suggest a good way to process these images for training, so that no sensitive information is destroyed?

Morganh · November 22, 2019, 3:48am

The training image size should be close to the size of the images the model will see at inference. It is better to keep training as close as possible to actual use.
You could also consider padding.
Example:
If your images are 480x480 and 640x640, pad the 480x480 images with zeros to make them 640x640. Make sure to keep the aspect ratio when padding.
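
A rough sketch of the padding idea, under some assumptions that are not from this thread: images larger than the target are scaled down with their aspect ratio preserved, smaller images are not scaled up, and the zero padding is split evenly so the image is centered. The names `letterbox_params` and `letterbox_box` are hypothetical. Only the geometry is shown; the pixel resize and padding would be done with an image library.

```python
def letterbox_params(src_w, src_h, dst_w, dst_h, allow_upscale=False):
    """Return (scale, new_w, new_h, pad_x, pad_y) for fitting an image
    into a dst_w x dst_h canvas without changing its aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    if not allow_upscale:
        # Assumption: small images are only padded, never enlarged.
        scale = min(scale, 1.0)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    # Assumption: padding is split evenly, so the image sits in the center.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def letterbox_box(box, scale, pad_x, pad_y):
    """Map a (left, top, right, bottom) box into the padded canvas."""
    left, top, right, bottom = box
    return (
        left * scale + pad_x,
        top * scale + pad_y,
        right * scale + pad_x,
        bottom * scale + pad_y,
    )


if __name__ == "__main__":
    # The example from the post: a 480x480 image on a 640x640 canvas.
    scale, new_w, new_h, pad_x, pad_y = letterbox_params(480, 480, 640, 640)
    print(scale, new_w, new_h, pad_x, pad_y)
    print(letterbox_box((10, 20, 110, 220), scale, pad_x, pad_y))
```

Because the same scale is applied to both axes, objects keep their shape; the labels only need the scale and offset applied, as in `letterbox_box`.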

monita.ramirezb · March 18, 2021, 7:48pm

Hi,
I’m also interested in what image size to use when training a model. I hope you can read this and give me some advice:
According to Python-apps test3, input data from an RTSP camera is set at 1920x1080. To train a model and run it on this input, should I train on images that big? Wouldn’t that be too much to process?
Or could I train on smaller dimensions with the same height/width relation? Or can the input stream size be reshaped before feeding it to the model?

Morganh · March 18, 2021, 11:22pm

No. In TLT, you can train a model with different dimensions, then run inference in DeepStream on an mp4 of any size.

monita.ramirezb · March 19, 2021, 4:22pm

Great!
Now, for training, ALL images must have the same dimensions, even if, as you say, they differ from those of the input stream. Am I right?

Morganh · March 19, 2021, 11:16pm

For training images, please see the requirements of each network. For example, ssd can accept training images with any resolution, but detectnet_v2 and frcnn cannot.
For inference, all the networks accept images with any resolution.
