Please provide the following information when requesting support.
• Hardware (RTX 3080)
• Network Type (Classification)
• TLT Version (Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022
)
• Training spec file (model_config {
# Model Architecture can be chosen from:
# ['resnet', 'vgg', 'googlenet', 'alexnet']
arch: "resnet"
# for resnet --> n_layers can be [10, 18, 50]
n_layers: 50
use_batch_norm: True
use_bias: False
all_projections: False
use_pooling: True
retain_head: False
resize_interpolation_method: BILINEAR
# image size can be "3, X, Y", where X,Y >= 16
input_image_size: "3,2000,700"
}
train_config {
train_dataset_path: "/home/peter/TAO_toolkit/data/train"
val_dataset_path: "/home/peter/TAO_toolkit/data/val"
pretrained_model_path: "/home/peter/TAO_toolkit/models" # resnet_50 not in the right format yet
# Only ['sgd', 'adam'] are supported for optimizer
optimizer {
adam {
lr: 0.01
decay: 0.0
momentum: 0.9
nesterov: False
}
}
batch_size_per_gpu: 10
n_epochs: 150
# Number of CPU cores for loading data
n_workers: 1
# regularizer
reg_config {
# regularizer type can be "L1", "L2" or "None".
type: "L2"
# if the type is not "None",
# scope can be either "Conv2D" or "Dense" or both.
scope: "Conv2D,Dense"
# 0 < weight decay < 1
weight_decay: 0.000015
}
# learning_rate
lr_config {
cosine {
learning_rate: 0.03
soft_start: 0.0
}
}
enable_random_crop: True
enable_center_crop: True
enable_color_augmentation: False
mixup_alpha: 0.1
label_smoothing: 0.1
preprocess_mode: "torch"
image_mean {
key: 'b'
value:
}
image_mean {
key: 'g'
value:
}
image_mean {
key: 'r'
value:
}
}
eval_config {
eval_dataset_path: "/home/peter/TAO_toolkit/data/test"
model_path: "/workspace/tao-experiments/classification/weights/resnet_50.tao"
top_k: 1
batch_size: 20
n_workers: 1
enable_center_crop: True
}
)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
Hi there.
I have seen the earlier "tlt-prune: error" post, where the suggested solution was to place the line enc_key: 'abcdef' in the spec file, where 'abcdef' is my NGC API key. This is not mentioned in:
Image Classification — TAO Toolkit 3.22.05 documentation.
Where should this line be placed in the training spec file (the model_config file)?
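
For reference, here is a minimal sketch of a sanity check that could be run on the spec before adding any new line to it. It assumes the spec is plain protobuf-text style, where a line ending in `{` opens a block and a line containing only `}` closes it. The helper names `top_level_blocks` and `lines_with_curly_quotes` are hypothetical and are not part of TAO Toolkit. It only lists the top-level blocks and flags typographic quotes picked up from copy and paste; it does not say whether any particular key is valid.

```python
# Hypothetical helpers, not part of TAO Toolkit.
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"


def top_level_blocks(spec_text):
    """Return the names of the top-level blocks in a spec.

    Raises ValueError if the braces are unbalanced.
    """
    depth = 0
    names = []
    for lineno, raw in enumerate(spec_text.splitlines(), start=1):
        # Naive comment stripping: assumes no '#' inside quoted values.
        line = raw.split("#", 1)[0].strip()
        if line.endswith("{"):
            if depth == 0:
                names.append(line[:-1].strip())
            depth += 1
        elif line == "}":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unexpected '}}' on line {lineno}")
    if depth != 0:
        raise ValueError(f"{depth} block(s) left unclosed")
    return names


def lines_with_curly_quotes(spec_text):
    """Return 1-based line numbers that contain typographic quotes."""
    return [
        lineno
        for lineno, line in enumerate(spec_text.splitlines(), start=1)
        if any(q in line for q in CURLY_QUOTES)
    ]


sample = (
    "model_config {\n"
    '  arch: "resnet"\n'
    "}\n"
    "train_config {\n"
    "  optimizer {\n"
    "    adam {\n"
    "    }\n"
    "  }\n"
    "}\n"
)
print(top_level_blocks(sample))  # ['model_config', 'train_config']
```

Running `top_level_blocks` on the full spec text would raise a `ValueError` if a block such as eval_config were left without its closing brace.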