I am trying to use the Transfer Learning Toolkit 3.0 (TLT-3.0) to train the UNet model, but I get this error:
NameError: name 'unet' is not defined
Can you please advise how to fix this problem?
Thanks
Jimmy
- The TLT-3.0 Docker image is run first:
docker run --runtime=nvidia -it -v /home/jiande/workspace:/workspace nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
- Here is the spec.txt file I used:
checkpoint: "resnet101/resnet_101.hdf5" - Here I have downloaded the UNet model from NGC
model_config {
num_layers: 18
all_projections: true
arch: "resnet"
freeze_blocks: 0
freeze_blocks: 1
use_batch_norm: true
training_precision {
backend_floatx: FLOAT32
}
model_input_height: 320
model_input_width: 320
model_input_channels: 3
}
training_config {
batch_size: 2
epochs: 3
log_summary_steps: 10
checkpoint_interval: 1
loss: "cross_dice_sum"
learning_rate: 0.0001
regularizer {
type: L2
weight: 3.00000002618e-09
}
optimizer {
adam {
epsilon: 9.99999993923e-09
beta1: 0.899999976158
beta2: 0.999000012875
}
}
}
data_config {
image_size: "(832, 1344)"
augment_input_data: True
eval_samples: 5000
training_file_pattern: "$DATA_DIR/train*.tfrecord"
validation_file_pattern: "$DATA_DIR/val*.tfrecord"
val_json_file: "$DATA_DIR/annotations/instances_val2017.json"
num_classes: 91
skip_crowd_during_training: True
}
- Command used to train:
tlt-train unet -e spec.txt -d ./workspace -k key --gpus 1
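
In case it helps rule out a formatting problem in the spec itself, here is a minimal sketch of a sanity check I could run on spec.txt before training. It only looks for two things that are easy to break when copying a spec around: typographic (curly) quotes and unbalanced braces. The function name check_spec is hypothetical and is not part of TLT, and the brace count is naive (it would also count braces that appear inside quoted strings).

```python
def check_spec(text):
    """Return a list of (line_number, message) problems found in a spec."""
    problems = []
    depth = 0  # current nesting depth of { ... } blocks
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        # Typographic quotes are not valid string delimiters in a text spec.
        if any(ch in "\u201c\u201d\u2018\u2019" for ch in line):
            problems.append((number, "typographic quote"))
        # Naive brace tracking: assumes no braces inside quoted strings.
        depth += line.count("{") - line.count("}")
        if depth < 0:
            problems.append((number, "unmatched closing brace"))
            depth = 0
    if depth > 0:
        problems.append((len(lines), "missing closing brace"))
    return problems
```

It could be called as check_spec(open("spec.txt", encoding="utf-8").read()); an empty list means neither of the two problems was found, and says nothing about whether the field names are valid for UNet.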