Segmentation with UNet: AssertionError: Freeze blocks is only possible if a pretrained model file is provided

Hello.
I’m trying semantic segmentation with TLT v3 on a custom dataset.
I use a ResNet18 backbone; however, when launching training I get the following error:

  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py", line 403, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py", line 397, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py", line 298, in run_experiment
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py", line 133, in train_unet
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/model/build_unet_model.py", line 75, in build_model
AssertionError: Freeze blocks is only possible if a pretrained model file is provided.

I launch training as follows:

tlt unet train --gpus=2 \
  -e /workspace/tlt-experiments/specs/resnet18.txt \
  -r /output/runs/resnet18_run1 \
  -m /output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5 \
  -n resnet18_lip \
  -k $KEY

Here is my spec file in case it helps. I put a freeze_blocks field in it, and I also provided a pretrained model path via the training command above, so I don’t understand why it says freezing is only possible if a pretrained model is provided:

random_seed: 42

model_config {
   model_input_width: 224
   model_input_height: 224
   model_input_channels: 3
   num_layers: 18
   all_projections: true
   arch: "resnet"
   use_batch_norm: true
   freeze_blocks: 0
   freeze_blocks: 1
   training_precision {
      backend_floatx: FLOAT32
   }
}

training_config {
   batch_size: 64
   epochs: 1
   log_summary_steps: 10
   checkpoint_interval: 1
   loss: "cross_entropy"
   learning_rate: 0.0001
   regularizer {
      type: L2
      weight: 3.00000002618e-09
   }
   optimizer {
      adam {
         epsilon: 9.99999993923e-09
         beta1: 0.899999976158
         beta2: 0.999000012875
      }
   }
}


dataset_config {
   dataset: "custom"
   augment: false
   input_image_type: "color"
   train_images_path: "/data1/TrainVal_images/TrainVal_images/train_images/"
   train_masks_path: "/data1/TrainVal_parsing_annotations/TrainVal_simplified_annotations/train_segmentations/"

   val_images_path: "/data1/TrainVal_images/TrainVal_images/val_images/"
   val_masks_path: "/data1/TrainVal_parsing_annotations/TrainVal_simplified_annotations/val_segmentations/"

   test_images_path: "/data1/Testing_images/testing_images/"

   data_class_config {
        target_classes {
            label_id: 0
            name: 'Background'
            mapping_class: 'Background'
       }
       target_classes {
           label_id: 1
           name: 'Hat'
           mapping_class: 'Hat'
       }
       target_classes {
           label_id: 2
           name: 'Hair'
           mapping_class: 'Hair'
       }
       target_classes {
           label_id: 3
           name: 'Glove'
           mapping_class: 'Glove'
       }
       target_classes {
           label_id: 4
           name: 'Sunglasses'
           mapping_class: 'Sunglasses'
       }
       target_classes {
           label_id: 5
           name: 'UpperClothes'
           mapping_class: 'UpperClothes'
       }
       target_classes {
           label_id: 6
           name: 'Dress'
           mapping_class: 'Dress'
       }
       target_classes {
           label_id: 7
           name: 'Coat'
           mapping_class: 'Coat'
       }
       target_classes {
           label_id: 8
           name: 'Socks'
           mapping_class: 'Socks'
       }
       target_classes {
           label_id: 9
           name: 'Pants'
           mapping_class: 'Pants'
       }
       target_classes {
           label_id: 10
           name: 'Jumpsuits'
           mapping_class: 'Jumpsuits'
       }
       target_classes {
           label_id: 11
           name: 'Scarf'
           mapping_class: 'Scarf'
       }
       target_classes {
           label_id: 12
           name: 'Skirt'
           mapping_class: 'Skirt'
       }
       target_classes {
           label_id: 13
           name: 'Face'
           mapping_class: 'Face'
       }
       target_classes {
           label_id: 14
           name: 'Left-arm'
           mapping_class: 'Left-arm'
       }
       target_classes {
           label_id: 15
           name: 'Right-arm'
           mapping_class: 'Right-arm'
       }
       target_classes {
           label_id: 16
           name: 'Left-leg'
           mapping_class: 'Left-leg'
       }
       target_classes {
           label_id: 17
           name: 'Right-leg'
           mapping_class: 'Right-leg'
       }
       target_classes {
           label_id: 18
           name: 'Left-shoe'
           mapping_class: 'Left-shoe'
       }
       target_classes {
           label_id: 19
           name: 'Right-shoe'
           mapping_class: 'Right-shoe'
       }

   }
}

The file /output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5 should be the path inside the docker container. Can you run the following command to check whether it is available?

$ tlt unet run ls /output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5

Thanks for the reply.
Yes, I understand that, and I did the correct mapping via the .tlt_mounts.json file.
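
For reference, the relevant part of my .tlt_mounts.json looks roughly like this (the host-side source paths are examples from my machine, not something specific you need):

{
    "Mounts": [
        {
            "source": "/home/me/tlt-output",
            "destination": "/output"
        },
        {
            "source": "/home/me/datasets/lip",
            "destination": "/data1"
        }
    ]
}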

When I run the command tlt unet run ls -lsht /output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5, this is what I get:

89M -rw-------. 1 16234 1638 89M Apr 23 11:54 /output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5
2021-04-23 15:26:51,858 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

So it seems the container is able to access the file correctly.

I also noticed that in the classification tutorial, the pretrained model path is provided via the spec file under the field pretrained_model_path, so I tried the same for my segmentation case, but I got an error saying that this field is not recognized.

If you freeze blocks, please set the pretrained model file in the training spec, for example:

freeze_blocks: 0
freeze_blocks: 1
pretrained_model_file: "/workspace/tlt-experiments/unet/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5"

Then remove "-m /workspace/tlt-experiments/unet/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5" from the command line.
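
In other words, the model_config section of your spec would end up looking like this (a sketch using the container-side path from your command; adjust it to wherever the file is mounted in your setup):

model_config {
   model_input_width: 224
   model_input_height: 224
   model_input_channels: 3
   num_layers: 18
   all_projections: true
   arch: "resnet"
   use_batch_norm: true
   freeze_blocks: 0
   freeze_blocks: 1
   # container-side path; adjust to your own mount
   pretrained_model_file: "/output/pretrained_resnet18/tlt_semantic_segmentation_vresnet18/resnet_18.hdf5"
   training_precision {
      backend_floatx: FLOAT32
   }
}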

Yes, adding your suggested changes resolved the issue (though I ended up trying the changes with the TLT-provided container nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3, where the command is slightly different: unet train instead of tlt unet train).
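
For anyone hitting the same error, the final command (run inside the container, hence no tlt prefix) looked like this; it is the same as my original one, just without the -m flag, since the pretrained model is now given in the spec file:

unet train --gpus=2 \
  -e /workspace/tlt-experiments/specs/resnet18.txt \
  -r /output/runs/resnet18_run1 \
  -n resnet18_lip \
  -k $KEY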

Should I understand that the doc is not up to date (or that I’m not looking at the right doc)? Nowhere does it say that we need to add pretrained_model_file in the spec file instead of passing the pretrained model through the CLI.

Thanks again for your help

Thanks for the info. I will sync with the internal team about the doc.