Errors in documentation for classification?

There appear to be inconsistencies in the documentation for training a classifier:

  1. The docs mention downloading the pre-trained model, but the classifier training command does not accept it as input (unlike SSD model training, which has the -m option), and it is not specified anywhere in the specification file either. If this is really how things should work, how are we doing transfer learning? Is there perhaps an undocumented mechanism for loading the pre-trained model? If not, why is a pre-trained model mentioned in the docs and provided on NGC at all if it can't be used as a training input?
  2. The example specification file for classifier training includes a field named "conf_threshold", but the parser doesn't expect this field:
    2019-12-17 04:01:27,761 [INFO] iva.makenet.scripts.train: Loading experiment spec at specs/classification_resnet_train.txt.
    2019-12-17 04:01:27,763 [INFO] iva.makenet.spec_handling.spec_loader: Merging specification from specs/classification_resnet_train.txt
    Traceback (most recent call last):
      File "/usr/local/bin/tlt-train-g1", line 8, in <module>
        sys.exit(main())
      File "./common/magnet_train.py", line 27, in main
      File "./makenet/scripts/train.py", line 410, in main
      File "./makenet/scripts/train.py", line 271, in run_experiment
      File "./makenet/spec_handling/spec_loader.py", line 75, in load_experiment_spec
      File "./makenet/spec_handling/spec_loader.py", line 51, in load_proto
      File "./makenet/spec_handling/spec_loader.py", line 35, in _load_from_file
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 693, in Merge
        allow_unknown_field=allow_unknown_field)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 760, in MergeLines
        return parser.MergeLines(lines, message)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 785, in MergeLines
        self._ParseOrMerge(lines, message)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 807, in _ParseOrMerge
        self._MergeField(tokenizer, message)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 932, in _MergeField
        merger(tokenizer, message, field)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1006, in _MergeMessageField
        self._MergeField(tokenizer, sub_message)
      File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 899, in _MergeField
        (message_descriptor.full_name, name))
    google.protobuf.text_format.ParseError: 28:3 : Message type "EvalConfig" has no field named "conf_threshold".
    

    When I remove the “conf_threshold” field this error goes away.

Hi monocongo,
The pre-trained model can be used as input. It is configured in the spec file, through the pretrained_model_path field of train_config.
When you start the Docker container, you can find the example spec file under “examples/classification/specs/”. The relevant part looks like this:

...
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/pretrained_resnet18/tlt_resnet18_classification_v1/resnet18.hdf5"
  optimizer: "sgd"
  batch_size_per_gpu: 64
  n_epochs: 80
  n_workers: 16
...

The example in the TLT user guide does not mention the pre-trained model, which is likely the source of the confusion.
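
Before launching a training run, it may help to confirm that a spec file actually sets pretrained_model_path. The following is only a minimal sketch, not part of TLT: find_pretrained_model_path is a hypothetical helper that scans the spec text with a regular expression rather than using the real protobuf parser, and the sample spec is a shortened copy of the excerpt above.

```python
import re


def find_pretrained_model_path(spec_text):
    """Return the quoted pretrained_model_path value, or None if the field is absent.

    Hypothetical helper: a plain text scan, not the parser TLT itself uses.
    """
    match = re.search(
        r'^\s*pretrained_model_path\s*:\s*"([^"]*)"',
        spec_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


# Shortened copy of the container's example spec, for illustration only.
sample_spec = '''train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  pretrained_model_path: "/workspace/tlt-experiments/pretrained_resnet18/tlt_resnet18_classification_v1/resnet18.hdf5"
  optimizer: "sgd"
}
'''

path = find_pretrained_model_path(sample_spec)
if path is None:
    print("No pretrained_model_path set; no pre-trained weights are configured.")
else:
    print("Pre-trained model configured at: " + path)
```

If the helper returns None for your own spec, the spec does not name a pre-trained model.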

Also, the “conf_threshold” field should not appear in eval_config. I will sync with the internal team to update the documentation.
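
Until the documentation is corrected, a spec copied from it can be cleaned up by deleting the offending line, either by hand or with a small script. The sketch below is one possible way to do it, under stated assumptions: drop_field is a hypothetical helper, it works line by line on the spec text (so it removes the field wherever it appears, not only inside eval_config), and the eval_config contents shown are placeholder values rather than a copy of the real example.

```python
import re


def drop_field(spec_text, field_name):
    """Remove every line of the form `field_name: value` from a text-format spec.

    Hypothetical helper: assumes each scalar field sits on its own line.
    """
    pattern = re.compile(r"^\s*" + re.escape(field_name) + r"\s*:")
    kept = [line for line in spec_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Placeholder eval_config block; the values are illustrative assumptions.
example_spec = '''eval_config {
  top_k: 3
  conf_threshold: 0.5
}
'''

cleaned = drop_field(example_spec, "conf_threshold")
print(cleaned)
```

Writing the cleaned text back to the spec file should avoid the ParseError about the unknown field, since the parser no longer sees it.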

Thanks for your help, Morgan. Going forward I will assume that the valid examples are the ones in the Docker container and will not trust those shown in the documentation, since it doesn’t appear that your development process includes a mechanism to keep the two in sync. Good to know…