1- I'm training a classification model using TAO. I get good validation accuracy during training, but when I run inference, the results are not good.

2- I used the same dataset to train a ResNet18 in PyTorch and got good results (ONNX model). When I then ran the same ONNX model in DeepStream, the results were different.

• Hardware : DGX
• Network Type : Resnet18/Classification
• Training spec file :

    model_config {
      # Model Architecture can be chosen from:
      # ['resnet', 'vgg', 'googlenet', 'alexnet']
      arch: "resnet"
      # for resnet --> n_layers can be [10, 18, 50]
      # for vgg --> n_layers can be [16, 19]
      n_layers: 18
      freeze_blocks: 0
      freeze_blocks: 1
      use_batch_norm: True
      use_bias: False
      all_projections: False
      use_pooling: True
      use_imagenet_head: False
      resize_interpolation_method: BILINEAR
      # if you want to use the pretrained model,
      # image size should be "3,224,224"
      # otherwise, it can be "3, X, Y", where X,Y >= 16
      input_image_size: "3,224,224"
    }
    train_config {
      train_dataset_path: "/raid/dataset/camera_state_classification/05-01-2023"
      val_dataset_path: "/raid/dataset/camera_state_classification/test05-01"
      #pretrained_model_path: "/workspace/tao_experiments/train/classification/camera_state/224-224/v0-pruned/resnet_015.tlt"
      pretrained_model_path: "/workspace/tao_experiments/nvidia_pretrained/imagenet/resnet_18.hdf5"
      # Only ['sgd', 'adam'] are supported for optimizer
      optimizer {
        sgd {
          lr: 0.0001
          decay: 0.00001
          momentum: 0.9
          nesterov: False
        }
      }
      batch_size_per_gpu: 32
      n_epochs: 30
      # Number of CPU cores for loading data
      n_workers: 16
      reg_config {
        # regularizer type can be "L1", "L2" or "None".
        type: "L2"
        # if the type is not "None",
        # scope can be either "Conv2D" or "Dense" or both.
        scope: "Conv2D,Dense"
        # 0 < weight decay < 1
        weight_decay: 0.000015
      }
      lr_config {
        cosine {
          learning_rate: 0.001
          min_lr_ratio: 0.1
          soft_start: 0.0
        }
      }
      enable_random_crop: False
      enable_center_crop: False
      enable_color_augmentation: False
      mixup_alpha: 0.0
      label_smoothing: 0.1
      preprocess_mode: "caffe"
      image_mean {
        key: 'b'
        value: 103.939
      }
      image_mean {
        key: 'g'
        value: 116.779
      }
      image_mean {
        key: 'r'
        value: 123.68
      }
    }
    eval_config {
      eval_dataset_path: "/raid/dataset/camera_state_classification/test05-01"
      model_path: "/workspace/tao_experiments/train/classification/camera_state/05-01-2023/weights/resnet_025.tlt"
      top_k: 1
      batch_size: 256
      n_workers: 8
      enable_center_crop: True
    }
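
To make the training-side preprocessing concrete, here is a minimal sketch of what `preprocess_mode: "caffe"` with the `image_mean` values from the spec would amount to. It assumes the usual Keras-style "caffe" convention (RGB converted to BGR, per-channel mean subtracted, no scaling by 255 or by a standard deviation). The helper names `caffe_preprocess_pixel` and `caffe_preprocess` are hypothetical and are not part of TAO.

```python
from typing import List, Sequence, Tuple

# Per-channel means copied from the image_mean entries in the spec.
IMAGE_MEAN = {"b": 103.939, "g": 116.779, "r": 123.68}


def caffe_preprocess_pixel(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Hypothetical helper: map one RGB pixel (0-255) to a BGR, mean-subtracted triple.

    Assumes the Keras-style "caffe" convention: reorder to BGR and subtract
    the channel mean, with no further scaling.
    """
    return (b - IMAGE_MEAN["b"], g - IMAGE_MEAN["g"], r - IMAGE_MEAN["r"])


def caffe_preprocess(image_rgb: Sequence[Sequence[Tuple[float, float, float]]]) -> List[List[List[float]]]:
    """Convert an H x W grid of (r, g, b) pixels into three H x W planes in B, G, R order."""
    planes: List[List[List[float]]] = [[], [], []]
    for row in image_rgb:
        converted = [caffe_preprocess_pixel(*pixel) for pixel in row]
        for channel in range(3):
            planes[channel].append([pixel[channel] for pixel in converted])
    return planes


if __name__ == "__main__":
    # A 1 x 2 test image: one pure red pixel and one pure blue pixel.
    tiny = [[(255, 0, 0), (0, 0, 255)]]
    for name, plane in zip("BGR", caffe_preprocess(tiny)):
        print(name, plane)
```

If the inference pipeline feeds the network RGB instead of BGR, or rescales pixels before the model sees them, the input no longer matches what this sketch produces, so that is one thing worth comparing between training and inference.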


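
The `lr_config` block in the spec requests a cosine schedule with `learning_rate: 0.001`, `min_lr_ratio: 0.1`, and `soft_start: 0.0`. As a rough illustration of the learning rates this implies over the 30 epochs, here is a minimal sketch. It assumes a standard cosine-annealing formula with an optional linear warm-up and is not taken from the TAO source; `cosine_lr` and `progress` are hypothetical names.

```python
import math


def cosine_lr(progress: float, base_lr: float = 0.001,
              min_lr_ratio: float = 0.1, soft_start: float = 0.0) -> float:
    """Hypothetical cosine schedule; progress is the fraction of training done, in [0, 1]."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    min_lr = base_lr * min_lr_ratio
    if soft_start > 0.0 and progress < soft_start:
        # Assumed linear warm-up from min_lr to base_lr; skipped when soft_start is 0.
        return min_lr + (base_lr - min_lr) * progress / soft_start
    # Cosine decay from base_lr down to min_lr over the rest of training.
    t = (progress - soft_start) / (1.0 - soft_start) if soft_start < 1.0 else 1.0
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))


if __name__ == "__main__":
    n_epochs = 30  # n_epochs from the spec
    for epoch in (0, 15, 30):
        print(epoch, cosine_lr(epoch / n_epochs))
```

Under these assumptions the rate starts at 0.001 and decays to 0.0001 (0.001 × 0.1) by the last epoch.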