Hello,
We have trained our TAO UNet model and compared it against a PyTorch model with similar parameters, but the two produce quite different results. It looks as if they start from different initialization points: our PyTorch model starts from completely random weights, while for TAO UNet we have not come across any documentation describing a specific initialization scheme (for example, ImageNet pretrained weights or something else).
So it would be helpful if someone could point us to any starting weights that TAO UNet uses on the backend that we are not aware of, or suggest other weights we could use to obtain similar-looking results.
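In case it helps frame the question: Keras Conv2D layers default to glorot_uniform initialization, while PyTorch's nn.Conv2d defaults to a Kaiming-style scheme, so even identical architectures start from differently distributed weights. Below is a minimal sketch of how we could re-initialize the PyTorch model Keras-style, assuming (unconfirmed) that TAO's TF1/Keras backend uses the Keras defaults when no pretrained weights are supplied; the UNet class name is just a placeholder for our model:

```python
import torch.nn as nn

def init_keras_style(m: nn.Module) -> None:
    # Glorot (Xavier) uniform weights + zero biases, matching Keras's
    # default Conv2D/Dense initializers. Assumption: TAO's TF1/Keras
    # backend uses these defaults when no pretrained weights are given.
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# model = UNet(num_classes=2)   # placeholder for our PyTorch UNet
# model.apply(init_keras_style)
```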
Thanks in advance!
• Hardware (T4/V100)
• Network Type (Unet)
• TLT Version
```
Configuration of the TAO Toolkit Instance
dockers:
  nvidia/tao/tao-toolkit-tf:
    v3.21.11-tf1.15.5-py3:
      docker_registry: nvcr.io
      tasks:
        1. augment
        2. bpnet
    v3.21.11-tf1.15.4-py3:
      docker_registry: nvcr.io
      tasks:
        1. detectnet_v2
        2. faster_rcnn
  nvidia/tao/tao-toolkit-pyt:
    v3.21.11-py3:
      docker_registry: nvcr.io
      tasks:
        1. speech_to_text
        2. speech_to_text_citrinet
        3. text_classification
        4. question_answering
        5. token_classification
        6. intent_slot_classification
        7. punctuation_and_capitalization
        8. spectro_gen
        9. vocoder
        10. action_recognition
  nvidia/tao/tao-toolkit-lm:
    v3.21.08-py3:
      docker_registry: nvcr.io
      tasks:
        1. n_gram
format_version: 2.0
toolkit_version: 3.21.11
published_date: 11/08/2021
```
• Training spec file
```
random_seed: 42
dataset_config {
  augment: true
  dataset: "custom"
  input_image_type: "color"
  train_images_path: "train_aug"
  train_masks_path: "trainannot_aug"
  val_images_path: "val"
  val_masks_path: "valannot"
  test_images_path: "test"
  data_class_config {
    target_classes {
      name: "background"
      mapping_class: "background"
    }
    target_classes {
      name: "***"
      label_id: 1
      mapping_class: "***"
    }
  }
  augmentation_config {
    spatial_augmentation {
      hflip_probability: 0.5
      vflip_probability: 0.5
    }
    brightness_augmentation {
      delta: 0.20000000298023224
    }
  }
}
model_config {
  num_layers: 18
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
  all_projections: true
  model_input_height: 512
  model_input_width: 512
  model_input_channels: 3
}
training_config {
  batch_size: 16
  regularizer {
    type: L2
    weight: 1.9999999494757503e-05
  }
  optimizer {
    adam {
      epsilon: 9.99999993922529e-09
      beta1: 0.8999999761581421
      beta2: 0.9990000128746033
    }
  }
  checkpoint_interval: 1
  log_summary_steps: 1
  learning_rate: 9.999999747378752e-05
  loss: "cross_entropy"
  epochs: 200
  weights_monitor: true
}
```
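For completeness, this is roughly how we translate the training_config above to the PyTorch side (the nn.Conv2d stand-in just keeps the snippet self-contained; note that PyTorch's Adam applies weight_decay as an additive L2 gradient term, which may differ from TAO's loss-level L2 regularizer by a constant-factor convention):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 2, kernel_size=3, padding=1)  # stand-in for the real UNet

# Values taken from training_config in the spec above.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,             # learning_rate: 9.999999747378752e-05
    betas=(0.9, 0.999),  # beta1 / beta2
    eps=1e-8,            # epsilon: 9.99999993922529e-09
    weight_decay=2e-5,   # approximates regularizer { type: L2, weight: 2e-05 }
)
criterion = nn.CrossEntropyLoss()  # loss: "cross_entropy"
```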