Poor results with resnet10 and resnet18 on TLT-V2 when using the model in DeepStream

Hi Team,

I have trained an age-classification model (on person bounding boxes, not faces) using resnet10 on TLT-V2.
Classes: age_0-15, age_16-35, age_36-55, age_55+
The results on my test data are:

Train Resnet10, Epoch 102:
Confusion Matrix
[[248   5   1   0]
 [  6 207  37   4]
 [  1  15 235   3]
 [  0   0   0 254]]
Classification Report
              precision  recall  f1-score  support
age_0-15         0.97     0.98     0.97      254
age_16-35        0.91     0.81     0.86      254
age_36-55        0.86     0.93     0.89      254
age_55+          0.97     1.00     0.99      254

micro avg        0.93     0.93     0.93     1016
macro avg        0.93     0.93     0.93     1016
weighted avg     0.93     0.93     0.93     1016
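As a sanity check, the per-class precision and recall above can be recomputed directly from the confusion matrix, assuming the usual scikit-learn convention that rows are ground truth and columns are predictions:

```python
import numpy as np

# Confusion matrix from the report above (rows = true class, cols = predicted).
cm = np.array([
    [248,   5,   1,   0],
    [  6, 207,  37,   4],
    [  1,  15, 235,   3],
    [  0,   0,   0, 254],
])

precision = cm.diagonal() / cm.sum(axis=0)  # correct / predicted per column
recall    = cm.diagonal() / cm.sum(axis=1)  # correct / actual per row
f1 = 2 * precision * recall / (precision + recall)

for name, p, r, f in zip(["age_0-15", "age_16-35", "age_36-55", "age_55+"],
                         precision, recall, f1):
    print(f"{name:10s} {p:.2f} {r:.2f} {f:.2f}")
```

The printed values match the classification report, which confirms the report is consistent with the matrix.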

Retrain Resnet10, Epoch 143:
Confusion Matrix
[[250   3   1   0]
 [  6 204  41   3]
 [  5  20 227   2]
 [  0   0   0 254]]
Classification Report
              precision  recall  f1-score  support
age_0-15         0.96     0.98     0.97      254
age_16-35        0.90     0.80     0.85      254
age_36-55        0.84     0.89     0.87      254
age_55+          0.98     1.00     0.99      254

micro avg        0.92     0.92     0.92     1016
macro avg        0.92     0.92     0.92     1016
weighted avg     0.92     0.92     0.92     1016

But when I deploy these models and test live in DeepStream, I get the age_55+ class most of the time, even when the person belongs to age_0-15 or age_16-35. Can you please suggest where I am going wrong?

My training_config file:

model_config {
  arch: "resnet",
  n_layers: 10
  # Setting these parameters to true to match the template downloaded from NGC.
  use_batch_norm: true
  all_projections: true
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,400,200"
}
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/classification/pretrained_resnet10/tlt_pretrained_classification_vresnet10/resnet_10.hdf5"
  optimizer: "sgd"
  batch_size_per_gpu: 64
  n_epochs: 300
  n_workers: 16

  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }

  # learning_rate
  lr_config {
    scheduler: "step"
    learning_rate: 0.006
    #soft_start: 0.056
    #annealing_points: "0.3, 0.6, 0.8"
    #annealing_divider: 10
    step_size: 10
    gamma: 0.1
  }
}
eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  model_path: "/workspace/tlt-experiments/classification/output/weights/resnet_102.tlt"
  top_k: 3
  batch_size: 256
  n_workers: 8
}
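On the DeepStream side, a secondary-classifier (sgie) nvinfer config for a TLT classification model typically looks roughly like the sketch below. The file names, key, and blob names here are placeholders, not your actual files; the values worth double-checking against the training spec are the input dims (they must match input_image_size "3,400,200"), net-scale-factor/offsets, and model-color-format, since a mismatch there skews the output toward one class:

```
[property]
# Placeholder paths and key -- substitute your own exported model and labels.
tlt-encoded-model=age_resnet10.etlt
tlt-model-key=<your_ngc_key>
labelfile-path=age_labels.txt
# Must match input_image_size "3,400,200" (C,H,W) from training.
uff-input-dims=3;400;200;0
uff-input-blob-name=input_1
output-blob-names=predictions/Softmax
network-type=1            # 1 = classifier
# Preprocessing: y = net-scale-factor * (x - offsets).
# These must match what tlt-infer / training used.
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1      # 1 = BGR
classifier-threshold=0.2
process-mode=2            # run on detected objects (sgie)
```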

I also tried resnet18, but I am getting the same issue.

Please help.
Thanks.

Please run tlt-infer first to check its result.

Hi Morganh,

I have also tested the result using tlt-infer on the complete test data.
Each class contains 254 images.
age_0-15 -> 19 wrong
age_16-35 -> 26 wrong
age_36-55 -> 106 wrong
age_55+ -> 17 wrong

But running with DeepStream gives results biased towards the class age_55+.

Please suggest what we should do.

Thanks.

So, do you mean the result of tlt-infer is expected but DeepStream's result is not?

Yes, but the tlt-infer results also need improvement:
age_0-15 -> 19 wrong
age_16-35 -> 26 wrong
age_36-55 -> 106 wrong
age_55+ -> 17 wrong
We have to reach 0-1 wrong in each class.

I am seeing the model biased toward a single class with resnet10 and resnet18; with resnet50 I did not get biased results.
For all three models the training data was the same, and the configuration parameters were also the same except for the number of layers.
Can you please let me know where the gap is?

You mentioned that your tlt-infer result with resnet10 and resnet18 is not good. What is the training accuracy in the log? If it does not meet your requirement, you need to run more experiments to reach a higher training accuracy: try fine-tuning hyper-parameters, adding data, etc. That is a training topic.

You also mention that your resnet50 tlt-infer result is good. What is its training accuracy in the log? Does DeepStream run well with this resnet50 model? If there is a gap between DeepStream and tlt-infer, you need to check the DeepStream config first.
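When tlt-infer looks fine but DeepStream is biased toward one class, the usual culprit is a preprocessing mismatch: nvinfer normalizes each frame as y = net-scale-factor * (x - offsets), and if that does not reproduce what the model saw during training, the input distribution shifts and the classifier can collapse onto a single class. A minimal sketch of the effect (the caffe-style per-channel means are an assumption about how the TLT classifier was trained; check the training spec):

```python
import numpy as np

def deepstream_preprocess(x_bgr, net_scale_factor, offsets):
    """Mimic nvinfer: y = net_scale_factor * (x - offsets), per channel."""
    return net_scale_factor * (x_bgr - np.asarray(offsets).reshape(3, 1, 1))

rng = np.random.default_rng(0)
# C,H,W frame matching input_image_size "3,400,200", pixel values 0-255.
frame = rng.uniform(0, 255, size=(3, 400, 200))

# Assumed correct setting: mean subtraction only (caffe-style BGR means).
correct = deepstream_preprocess(frame, 1.0, [103.939, 116.779, 123.68])
# A plausible wrong setting: scale to 0-1 with no mean subtraction.
wrong = deepstream_preprocess(frame, 1.0 / 255.0, [0.0, 0.0, 0.0])

# The two feeds differ wildly in scale; a model trained on one distribution
# can output a single dominant class when fed the other.
print(abs(correct).mean(), abs(wrong).mean())
```

Comparing net-scale-factor and offsets in the nvinfer config against the preprocessing that tlt-infer applies is therefore a good first check.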

Training logs :
Resnet10:

118,0.99943741209564,0.12332773998735994,0.9212598425196851,0.39120948261867355
119,0.99943741209564,0.12247711290119402,0.9114173228346457,0.39215778225050196
120,0.99943741209564,0.12329318260593253,0.9133858267716536,0.39798230190915385
121,0.99915611814346,0.12298165021985559,0.9153543307086615,0.3944643869644075
122,0.9988748241912799,0.12335022698810165,0.9173228346456693,0.3978482203807418
123,0.9977496483825598,0.12459035501324174,0.9153543307086615,0.39405789537223307
124,0.99971870604782,0.12321298180250176,0.9212598425196851,0.3877223948324759

Resnet18:
115,0.9930656931696147,0.16175098495326773,0.875,0.6687979786371698
116,0.9930656931696147,0.16010095211711242,0.8979591836734694,0.6186932793685368
117,0.9948905109489051,0.15361222642181563,0.8647959183673469,0.7049501638631431
118,0.9945255474452555,0.15107576564280656,0.8903061224489796,0.6329286779676165
119,0.9945255474452555,0.15169603078469743,0.9005102040816326,0.5941275841727549
120,0.9934306566732644,0.15416788234762901,0.8877551020408163,0.6722254655799087
121,0.994525546923171,0.1563258668584545,0.8852040816326531,0.7187542656854707
122,0.9908759124087592,0.15480450430925746,0.8877551020408163,0.6677013991438613
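Reading these rows as epoch, training accuracy, training loss, validation accuracy, validation loss (an assumption based on the value ranges), the roughly 0.08 gap between training and validation accuracy suggests the smaller backbones are overfitting rather than underfitting. A quick way to tabulate the gap from the log lines:

```python
# Two sample rows from the resnet10 log above; assumed column order:
# epoch, train_acc, train_loss, val_acc, val_loss.
lines = """\
118,0.99943741209564,0.12332773998735994,0.9212598425196851,0.39120948261867355
124,0.99971870604782,0.12321298180250176,0.9212598425196851,0.3877223948324759
"""

for line in lines.strip().splitlines():
    epoch, train_acc, train_loss, val_acc, val_loss = line.split(",")
    gap = float(train_acc) - float(val_acc)
    print(f"epoch {epoch}: train_acc={float(train_acc):.3f} "
          f"val_acc={float(val_acc):.3f} gap={gap:.3f}")
```

If the gap stays near 0.08 across epochs, stronger regularization or more data is more likely to help than simply training longer.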

Okay, I will fine-tune the hyper-parameters and check the DS configuration as well.