TLT, classification, How to set the number of classes for training the custom dataset

insight77 · January 18, 2021, 7:42am

I am trying to train resnet18 from NGC on ImageNet 1k, which has 1000 classes.
How can I change the number of outputs to 1000?

I got an error saying that the number of labels does not match.

  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 789, in _standardize_user_data
    exception_prefix='target')
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected predictions to have shape (176,) but got array with shape (1000,)

The printed network looks like this:

block_4c_relu (Activation)      (None, 2048, 14, 14) 0           add_16[0][0]
avg_pool (AveragePooling2D)     (None, 2048, 1, 1)   0           block_4c_relu[0][0]
flatten (Flatten)               (None, 2048)         0           avg_pool[0][0]
predictions (Dense)             (None, 176)          360624      flatten[0][0]

command

$ tlt-train classification -e i1k_r18_train.cfg -r /workspace/test1/ -k 1103

i1k_r18_train.cfg

model_config {
  arch: "resnet"
  n_layers: 18
  use_bias: True
  use_batch_norm: True
  all_projections: True
  use_pooling: False
  freeze_bn: False
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,224,224"
}
eval_config {
  eval_dataset_path: "/dataset/ILSVRC2012/val"
  model_path: "/workspace/test1"
  top_k: 3
  batch_size: 256
  n_workers: 8
}

train_config {
  train_dataset_path: "/dataset/ILSVRC2012/train"
  val_dataset_path: "/dataset/ILSVRC2012/val"
  pretrained_model_path: "/workspace/tlt_pretrained_classification_vresnet18/resnet_18.hdf5"

  optimizer: "sgd"
  batch_size_per_gpu: 64
  n_epochs: 3
  n_workers: 16

  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }

  lr_config {
    scheduler: "soft_anneal"
    learning_rate: 0.005
    soft_start: 0.056
    annealing_points: "0.3, 0.6, 0.8"
    annealing_divider: 10
  }
}

Morganh · January 18, 2021, 9:30am

Can you make sure your dataset contains 1000 folders?

insight77 · January 19, 2021, 12:50am

According to the log of tlt-train, there are 1000 classes in the directory.

2021-01-18 21:25:15,116 [INFO] iva.makenet.scripts.train: Loading experiment spec at i1k_r18_train.cfg.
Found 1281167 images belonging to 1000 classes.
2021-01-18 21:26:56,969 [INFO] iva.makenet.scripts.train: Processing dataset (train): /dataset/ILSVRC2012/train
Found 50000 images belonging to 1000 classes.
2021-01-18 21:27:00,119 [INFO] iva.makenet.scripts.train: Processing dataset (validation): /dataset/ILSVRC2012/val

Morganh · January 19, 2021, 1:40am

Please check your dataset again.
For example, if your dataset contains 256x256x3 images, the log looks similar to the one below.

Layer (type)                    Output Shape         Param #     Connected to
input_1 (InputLayer)            (None, 3, 256, 256)  0
conv1 (Conv2D)                  (None, 64, 128, 128) 9408        input_1[0][0]
bn_conv1 (BatchNormalization)   (None, 64, 128, 128) 256         conv1[0][0]
activation_1 (Activation)       (None, 64, 128, 128) 0           bn_conv1[0][0]
block_1a_conv_1 (Conv2D)        (None, 64, 64, 64)   36864       activation_1[0][0]
…
activation_17 (Activation)      (None, 512, 16, 16)  0           add_8[0][0]
avg_pool (AveragePooling2D)     (None, 512, 1, 1)    0           activation_17[0][0]
flatten (Flatten)               (None, 512)          0           avg_pool[0][0]
predictions (Dense)             (None, 1000)         513000      flatten[0][0]

Total params: 12,055,464
Trainable params: 12,043,816
Non-trainable params: 11,648

The number of classes is determined by the number of class folders in your training dataset.
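
To double-check the folder count outside of tlt-train, a small Python sketch like the one below can list the class folders. It assumes the one-subfolder-per-class layout implied by the "Found ... images belonging to 1000 classes" log lines. The helper name class_dirs is made up for illustration and is not part of TLT. The paths are the ones from the spec in this thread.

```python
from pathlib import Path


def class_dirs(dataset_root):
    """Return the sorted names of the immediate subfolders of dataset_root.

    Assumption: each immediate subfolder is one class; loose files are ignored.
    """
    root = Path(dataset_root)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Dataset paths copied from i1k_r18_train.cfg in this thread.
    for split in ("/dataset/ILSVRC2012/train", "/dataset/ILSVRC2012/val"):
        if Path(split).is_dir():
            print(split, len(class_dirs(split)), "class folders")
        else:
            print(split, "not found on this machine")
```

If train and val report different counts, or a count other than the one you expect, the dataset layout is the first thing to fix.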

Morganh · January 25, 2021, 3:51am

Please remove the line pretrained_model_path: "/workspace/tlt_pretrained_classification_vresnet18/resnet_18.hdf5" from your spec.
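
As a side note on reading the summaries above: a Dense layer with a bias has (input features + 1) x output units parameters, so the size of the classification head can be recovered from the Param # column. The sketch below does that arithmetic for the two predictions rows quoted in this thread. The function name dense_output_units is hypothetical and not part of Keras or TLT.

```python
def dense_output_units(param_count, input_features, use_bias=True):
    """Infer the number of output units of a Dense layer from its parameter count."""
    # Each output unit owns one weight per input feature, plus one bias if enabled.
    per_unit = input_features + (1 if use_bias else 0)
    units, remainder = divmod(param_count, per_unit)
    if remainder:
        raise ValueError("parameter count does not fit a Dense layer of this input size")
    return units


# predictions row from the failing run: 2048 inputs, 360624 params
print(dense_output_units(360624, 2048))  # 176
# predictions row from the example log: 512 inputs, 513000 params
print(dense_output_units(513000, 512))   # 1000
```

So the failing run really had a 176-way head while the data generator produced 1000-way labels, which matches the ValueError. That the 176 comes from the pretrained file is an inference from the fix suggested above, not something the logs state directly.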
