Errors in documentation for classification?

There appear to be inconsistencies in the documentation for training a classifier:

  1. The docs mention downloading the pre-trained model, but the classifier training command does not accept a pre-trained model as input (unlike SSD model training, which has the -m option), nor is one specified anywhere in the specification file. If this is really how things work, how is transfer learning being done? Is there an undocumented mechanism for loading the pre-trained model? If not, why is a pre-trained model mentioned in the docs or provided on NGC at all if it cannot be used as a training input?
  2. The example specification file for classifier training includes a field named "conf_threshold", but the parser doesn't expect this field:
    2019-12-17 04:01:27,761 [INFO] iva.makenet.scripts.train: Loading experiment spec at specs/classification_resnet_train.txt.
    2019-12-17 04:01:27,763 [INFO] iva.makenet.spec_handling.spec_loader: Merging specification from specs/classification_resnet_train.txt
    google.protobuf.text_format.ParseError: 28:3 : Message type "EvalConfig" has no field named "conf_threshold".

    When I remove the “conf_threshold” field this error goes away.

Hi monocongo,
The pre-trained model can be used as input. It is configured in the spec file through the pretrained_model_path field.
When you start the Docker container, you can find the spec file under “examples/classification/specs/”, as shown below.

train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/pretrained_resnet18/tlt_resnet18_classification_v1/resnet18.hdf5"
  optimizer: "sgd"
  batch_size_per_gpu: 64
  n_epochs: 80
  n_workers: 16

The example in the TLT user guide omits the pretrained_model_path field, which is most likely what caused the confusion.
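
If it helps, here is a quick way to confirm that a spec file actually sets pretrained_model_path before launching training. This is only a rough sketch that scans the spec text with a regular expression; it is not part of TLT, and the function name find_pretrained_model_path and the shortened example spec are made up for illustration.

```python
import re


def find_pretrained_model_path(spec_text):
    """Return the pretrained_model_path value found in the spec text, or None."""
    # Matches a line such as: pretrained_model_path: "/some/path/model.hdf5"
    match = re.search(
        r'^\s*pretrained_model_path\s*:\s*"([^"]*)"',
        spec_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


# Hypothetical, shortened spec excerpt based on the container example above.
example_spec = '''
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  pretrained_model_path: "/workspace/tlt-experiments/pretrained_resnet18/tlt_resnet18_classification_v1/resnet18.hdf5"
  optimizer: "sgd"
}
'''

path = find_pretrained_model_path(example_spec)
print(path if path else "no pretrained_model_path found in spec")
```

In practice you would read the text from your own spec file (for example with open(...).read()) instead of the inline string.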

Also, the “conf_threshold” field is not a valid field in eval_config, so it should not appear in the example. I will sync with the internal team to update the documentation.
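
Until the documentation is updated, the workaround is simply to delete the conf_threshold line from the spec. For anyone who wants to script that, a minimal sketch follows. It assumes the field sits on a line of its own, as in the usual spec layout; the helper name strip_field and the example eval_config contents are hypothetical and only there to show the idea.

```python
import re


def strip_field(spec_text, field_name):
    """Return spec_text without any line that sets the scalar field field_name."""
    # Assumes one "name: value" pair per line, as in the example specs.
    pattern = re.compile(r'^\s*' + re.escape(field_name) + r'\s*:')
    kept = [line for line in spec_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept)


# Hypothetical eval_config excerpt; the other field names and values are placeholders.
example_spec = '''eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  conf_threshold: 0.5
  batch_size: 256
}'''

cleaned = strip_field(example_spec, "conf_threshold")
print(cleaned)
```

This only edits text; it does not validate the rest of the spec, so the training script's own parser remains the final check.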

Thanks for your help, Morgan. Going forward I will assume that the valid examples are the ones in the Docker container and will not trust those shown in the documentation, as it doesn’t appear that your development process includes a mechanism to keep the two in sync. Good to know…