TLT classification example: loss and val_acc fail to converge during training

Please provide the following information when requesting support.

• Hardware (Ubuntu 18.04 + RTX 3060)
• Network Type (Classification)
• TLT Version (Please run "tlt info --verbose" and share "docker_tag" here)

Configuration of the TLT Instance
dockers:
  nvidia/tlt-streamanalytics:
    docker_registry: nvcr.io
    docker_tag: v3.0-py3
    tasks:
      1. augment
      2. bpnet
      3. classification
      4. detectnet_v2
      5. dssd
      6. emotionnet
      7. faster_rcnn
      8. fpenet
      9. gazenet
      10. gesturenet
      11. heartratenet
      12. lprnet
      13. mask_rcnn
      14. multitask_classification
      15. retinanet
      16. ssd
      17. unet
      18. yolo_v3
      19. yolo_v4
      20. tlt-converter
  nvidia/tlt-pytorch:
    docker_registry: nvcr.io
    docker_tag: v3.0-py3
    tasks:
      1. speech_to_text
      2. speech_to_text_citrinet
      3. text_classification
      4. question_answering
      5. token_classification
      6. intent_slot_classification
      7. punctuation_and_capitalization
format_version: 1.0
tlt_version: 3.0
published_date: 04/16/2021

• Training spec file (if you have one, please share it here)
model_config {
  arch: "resnet",
  n_layers: 18
  use_batch_norm: true
  all_projections: true
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,224,224"
}
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/classification/pretrained_resnet18/tlt_pretrained_classification_vresnet18/resnet_18.hdf5"
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 64
  n_epochs: 80
  n_workers: 16
  preprocess_mode: "caffe"
  enable_random_crop: True
  enable_center_crop: True
  label_smoothing: 0.0
  mixup_alpha: 0.1

  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }

  # learning_rate
  lr_config {
    step {
      learning_rate: 0.006
      step_size: 10
      gamma: 0.1
    }
  }
}
eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  model_path: "/workspace/tlt-experiments/classification/output/weights/resnet_080.tlt"
  top_k: 3
  batch_size: 256
  n_workers: 8
  enable_center_crop: True
}
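(Aside: mixup_alpha: 0.1 in this spec turns on mixup augmentation. For intuition, a minimal Python sketch of textbook mixup — assumed semantics; TLT's exact implementation may differ:)

import numpy as np

# Minimal sketch of standard mixup (Zhang et al., 2018); illustrative only.
def mixup(x1, y1, x2, y2, alpha=0.1, rng=np.random):
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2     # blend the two images
    y = lam * y1 + (1.0 - lam) * y2     # blend the one-hot labels
    return x, y

With alpha as small as 0.1, Beta(0.1, 0.1) concentrates near 0 and 1, so most mixed samples stay close to one of the two originals.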
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
Training fails to converge; the loss and accuracy plateau early. The training output is as follows:
Epoch 2/80
183/183 [==============================] - 40s 216ms/step - loss: 1.8445 - acc: 0.5090 - val_loss: 1.6074 - val_acc: 0.5347
Epoch 3/80
183/183 [==============================] - 40s 217ms/step - loss: 1.7428 - acc: 0.5356 - val_loss: 1.5872 - val_acc: 0.5395
Epoch 4/80
183/183 [==============================] - 39s 212ms/step - loss: 1.7061 - acc: 0.5494 - val_loss: 1.5709 - val_acc: 0.5407
Epoch 5/80
183/183 [==============================] - 39s 212ms/step - loss: 1.6480 - acc: 0.5672 - val_loss: 1.5713 - val_acc: 0.5371
Epoch 6/80
183/183 [==============================] - 39s 212ms/step - loss: 1.6354 - acc: 0.5654 - val_loss: 1.5768 - val_acc: 0.5437
Epoch 7/80
183/183 [==============================] - 39s 212ms/step - loss: 1.6221 - acc: 0.5700 - val_loss: 1.5685 - val_acc: 0.5467
Epoch 8/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5963 - acc: 0.5726 - val_loss: 1.5776 - val_acc: 0.5389
Epoch 9/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5900 - acc: 0.5823 - val_loss: 1.5743 - val_acc: 0.5431
Epoch 10/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5759 - acc: 0.5778 - val_loss: 1.5734 - val_acc: 0.5365
Epoch 11/80
183/183 [==============================] - 39s 213ms/step - loss: 1.5728 - acc: 0.5849 - val_loss: 1.5721 - val_acc: 0.5341
Epoch 12/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5790 - acc: 0.5814 - val_loss: 1.5749 - val_acc: 0.5431
Epoch 13/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5764 - acc: 0.5892 - val_loss: 1.5766 - val_acc: 0.5383
Epoch 14/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5678 - acc: 0.5870 - val_loss: 1.5762 - val_acc: 0.5353
Epoch 15/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5642 - acc: 0.5874 - val_loss: 1.5730 - val_acc: 0.5347
Epoch 16/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5638 - acc: 0.5902 - val_loss: 1.5749 - val_acc: 0.5359
Epoch 17/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5664 - acc: 0.5857 - val_loss: 1.5765 - val_acc: 0.5353
Epoch 18/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5496 - acc: 0.5929 - val_loss: 1.5762 - val_acc: 0.5359
Epoch 19/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5704 - acc: 0.5871 - val_loss: 1.5743 - val_acc: 0.5317
Epoch 20/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5602 - acc: 0.5912 - val_loss: 1.5757 - val_acc: 0.5329
Epoch 21/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5560 - acc: 0.5916 - val_loss: 1.5763 - val_acc: 0.5389
Epoch 22/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5667 - acc: 0.5927 - val_loss: 1.5742 - val_acc: 0.5341
Epoch 23/80
183/183 [==============================] - 40s 219ms/step - loss: 1.5598 - acc: 0.5909 - val_loss: 1.5750 - val_acc: 0.5323
Epoch 24/80
183/183 [==============================] - 39s 216ms/step - loss: 1.5546 - acc: 0.5902 - val_loss: 1.5753 - val_acc: 0.5317
Epoch 25/80
183/183 [==============================] - 40s 220ms/step - loss: 1.5599 - acc: 0.5893 - val_loss: 1.5774 - val_acc: 0.5341
Epoch 26/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5607 - acc: 0.5892 - val_loss: 1.5758 - val_acc: 0.5395
Epoch 27/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5537 - acc: 0.5895 - val_loss: 1.5790 - val_acc: 0.5353
Epoch 28/80
183/183 [==============================] - 39s 213ms/step - loss: 1.5557 - acc: 0.5922 - val_loss: 1.5760 - val_acc: 0.5365
Epoch 29/80
183/183 [==============================] - 40s 219ms/step - loss: 1.5594 - acc: 0.5914 - val_loss: 1.5745 - val_acc: 0.5359
Epoch 30/80
183/183 [==============================] - 40s 221ms/step - loss: 1.5574 - acc: 0.5864 - val_loss: 1.5744 - val_acc: 0.5365
Epoch 31/80
183/183 [==============================] - 40s 218ms/step - loss: 1.5599 - acc: 0.5903 - val_loss: 1.5778 - val_acc: 0.5353
Epoch 32/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5500 - acc: 0.5898 - val_loss: 1.5735 - val_acc: 0.5329
Epoch 33/80
183/183 [==============================] - 41s 221ms/step - loss: 1.5581 - acc: 0.5959 - val_loss: 1.5749 - val_acc: 0.5377
Epoch 34/80
183/183 [==============================] - 41s 223ms/step - loss: 1.5639 - acc: 0.5896 - val_loss: 1.5746 - val_acc: 0.5371
Epoch 35/80
183/183 [==============================] - 41s 223ms/step - loss: 1.5589 - acc: 0.5886 - val_loss: 1.5747 - val_acc: 0.5347
Epoch 36/80
183/183 [==============================] - 40s 216ms/step - loss: 1.5612 - acc: 0.5844 - val_loss: 1.5755 - val_acc: 0.5359
Epoch 37/80
183/183 [==============================] - 39s 216ms/step - loss: 1.5584 - acc: 0.5892 - val_loss: 1.5758 - val_acc: 0.5335
Epoch 38/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5454 - acc: 0.5930 - val_loss: 1.5775 - val_acc: 0.5353
Epoch 39/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5492 - acc: 0.5891 - val_loss: 1.5755 - val_acc: 0.5353
Epoch 40/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5619 - acc: 0.5879 - val_loss: 1.5749 - val_acc: 0.5353
Epoch 41/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5616 - acc: 0.5899 - val_loss: 1.5776 - val_acc: 0.5317
Epoch 42/80
183/183 [==============================] - 40s 217ms/step - loss: 1.5429 - acc: 0.5959 - val_loss: 1.5783 - val_acc: 0.5365
Epoch 43/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5563 - acc: 0.5930 - val_loss: 1.5766 - val_acc: 0.5353
Epoch 44/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5532 - acc: 0.5911 - val_loss: 1.5766 - val_acc: 0.5311
Epoch 45/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5456 - acc: 0.5938 - val_loss: 1.5767 - val_acc: 0.5347
Epoch 46/80
183/183 [==============================] - 39s 213ms/step - loss: 1.5500 - acc: 0.5933 - val_loss: 1.5766 - val_acc: 0.5353
Epoch 47/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5607 - acc: 0.5850 - val_loss: 1.5755 - val_acc: 0.5365
Epoch 48/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5505 - acc: 0.5933 - val_loss: 1.5741 - val_acc: 0.5335
Epoch 49/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5621 - acc: 0.5854 - val_loss: 1.5773 - val_acc: 0.5335
Epoch 50/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5588 - acc: 0.5871 - val_loss: 1.5745 - val_acc: 0.5371
Epoch 51/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5607 - acc: 0.5884 - val_loss: 1.5773 - val_acc: 0.5371
Epoch 52/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5642 - acc: 0.5879 - val_loss: 1.5759 - val_acc: 0.5371
Epoch 53/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5572 - acc: 0.5866 - val_loss: 1.5770 - val_acc: 0.5341
Epoch 54/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5521 - acc: 0.5950 - val_loss: 1.5756 - val_acc: 0.5359
Epoch 55/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5561 - acc: 0.5847 - val_loss: 1.5769 - val_acc: 0.5353
Epoch 56/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5601 - acc: 0.5867 - val_loss: 1.5764 - val_acc: 0.5305
Epoch 57/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5590 - acc: 0.5853 - val_loss: 1.5768 - val_acc: 0.5377
Epoch 58/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5526 - acc: 0.5914 - val_loss: 1.5765 - val_acc: 0.5395
Epoch 59/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5549 - acc: 0.5878 - val_loss: 1.5774 - val_acc: 0.5365
Epoch 60/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5650 - acc: 0.5898 - val_loss: 1.5739 - val_acc: 0.5365
Epoch 61/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5663 - acc: 0.5835 - val_loss: 1.5745 - val_acc: 0.5341
Epoch 62/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5482 - acc: 0.5952 - val_loss: 1.5778 - val_acc: 0.5311
Epoch 63/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5520 - acc: 0.5973 - val_loss: 1.5778 - val_acc: 0.5317
Epoch 64/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5473 - acc: 0.5921 - val_loss: 1.5744 - val_acc: 0.5377
Epoch 65/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5621 - acc: 0.5936 - val_loss: 1.5778 - val_acc: 0.5359
Epoch 66/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5509 - acc: 0.5895 - val_loss: 1.5730 - val_acc: 0.5395
Epoch 67/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5461 - acc: 0.5904 - val_loss: 1.5771 - val_acc: 0.5347
Epoch 68/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5547 - acc: 0.5894 - val_loss: 1.5762 - val_acc: 0.5371
Epoch 69/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5613 - acc: 0.5853 - val_loss: 1.5788 - val_acc: 0.5371
Epoch 70/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5533 - acc: 0.5923 - val_loss: 1.5805 - val_acc: 0.5353
Epoch 71/80
183/183 [==============================] - 39s 216ms/step - loss: 1.5589 - acc: 0.5859 - val_loss: 1.5749 - val_acc: 0.5353
Epoch 72/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5623 - acc: 0.5914 - val_loss: 1.5748 - val_acc: 0.5359
Epoch 73/80
183/183 [==============================] - 39s 216ms/step - loss: 1.5546 - acc: 0.5887 - val_loss: 1.5747 - val_acc: 0.5389
Epoch 74/80
183/183 [==============================] - 40s 220ms/step - loss: 1.5516 - acc: 0.5872 - val_loss: 1.5777 - val_acc: 0.5329
Epoch 75/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5519 - acc: 0.5900 - val_loss: 1.5744 - val_acc: 0.5377
Epoch 76/80
183/183 [==============================] - 39s 213ms/step - loss: 1.5618 - acc: 0.5842 - val_loss: 1.5776 - val_acc: 0.5401
Epoch 77/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5574 - acc: 0.5886 - val_loss: 1.5758 - val_acc: 0.5359
Epoch 78/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5618 - acc: 0.5888 - val_loss: 1.5776 - val_acc: 0.5395
Epoch 79/80
183/183 [==============================] - 39s 214ms/step - loss: 1.5518 - acc: 0.5879 - val_loss: 1.5759 - val_acc: 0.5353
Epoch 80/80
183/183 [==============================] - 39s 215ms/step - loss: 1.5634 - acc: 0.5909 - val_loss: 1.5756 - val_acc: 0.5353
2021-08-10 09:43:14,950 [INFO] __main__: Total Val Loss: 1.5756317377090454
2021-08-10 09:43:14,950 [INFO] __main__: Total Val accuracy: 0.5353293418884277
2021-08-10 09:43:14,950 [INFO] __main__: Training finished successfully.
2021-08-10 17:43:16,446 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Please change to another lr_config.
For example,

lr_config {
  soft_anneal {
    learning_rate: 0.05
    soft_start: 0.056
    annealing_points: [0.3, 0.6, 0.8]
    annealing_divider: 10
  }
}

Also, please set a lower batch_size_per_gpu. For example,

batch_size_per_gpu: 8
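
For intuition, here is a minimal Python sketch of how such a soft_anneal schedule evolves the learning rate — assumed semantics (linear warm-up over the soft_start fraction, then division by annealing_divider at each annealing point), not the actual TLT source:

def soft_anneal_lr(progress, base_lr=0.05, soft_start=0.056,
                   annealing_points=(0.3, 0.6, 0.8), annealing_divider=10.0):
    """progress: fraction of total epochs completed, in [0, 1]."""
    if progress < soft_start:
        # assumed warm-up: ramp linearly from 0 up to base_lr
        return base_lr * progress / soft_start
    lr = base_lr
    for point in annealing_points:
        if progress >= point:
            lr /= annealing_divider    # drop 10x at 30%, 60%, 80% of training
    return lr

# e.g. for an 80-epoch run: 0.05 until epoch 24, then 0.005, 0.0005, 0.00005
for p in (0.1, 0.4, 0.7, 0.9):
    print(f"progress {p:.0%}: lr = {soft_anneal_lr(p):.6f}")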

Thanks! I'll try it.

My settings are as follows:

lr_config {
  soft_anneal {
    learning_rate: 0.05
    soft_start: 0.056
    annealing_points: [0.3, 0.6, 0.8]
    annealing_divider: 10
  }
}

or

lr_config {
  cosine {
    learning_rate: 0.04
    soft_start: 0.0
  }
}

and batch_size_per_gpu: 4 or 8; the result is the same:

Epoch 76/80
2917/2917 [==============================] - 86s 29ms/step - loss: 1.6581 - acc: 0.5752 - val_loss: 1.9807 - val_acc: 0.5078
Epoch 77/80
2917/2917 [==============================] - 86s 29ms/step - loss: 1.6588 - acc: 0.5766 - val_loss: 2.0283 - val_acc: 0.4958
Epoch 78/80
2917/2917 [==============================] - 85s 29ms/step - loss: 1.6493 - acc: 0.5818 - val_loss: 2.0050 - val_acc: 0.5036
Epoch 79/80
2917/2917 [==============================] - 86s 29ms/step - loss: 1.6547 - acc: 0.5770 - val_loss: 2.0048 - val_acc: 0.5048
Epoch 80/80
2917/2917 [==============================] - 86s 29ms/step - loss: 1.6567 - acc: 0.5832 - val_loss: 1.9952 - val_acc: 0.5084
2021-08-12 06:52:39,371 [INFO] __main__: Total Val Loss: 1.9951542615890503
2021-08-12 06:52:39,371 [INFO] __main__: Total Val accuracy: 0.5083832144737244
2021-08-12 06:52:39,371 [INFO] __main__: Training finished successfully.
2021-08-12 14:52:40,806 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
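
(For reference, my understanding of the cosine option — a sketch under assumed semantics: after the soft_start warm-up, the learning rate decays from the base value toward zero along half a cosine wave. Not TLT source code:)

import math

def cosine_lr(progress, base_lr=0.04, soft_start=0.0):
    """progress: fraction of total epochs completed, in [0, 1]."""
    if soft_start > 0.0 and progress < soft_start:
        return base_lr * progress / soft_start   # assumed linear warm-up
    t = (progress - soft_start) / (1.0 - soft_start)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))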

My spec file:
model_config {
  arch: "resnet",
  n_layers: 18
  # These are set to true to match the template downloaded from NGC.
  use_batch_norm: true
  all_projections: true
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,224,224"
}
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/classification/pretrained_resnet18/tlt_pretrained_classification_vresnet18/resnet_18.hdf5"
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 8
  n_epochs: 80
  n_workers: 16
  preprocess_mode: "caffe"
  enable_random_crop: True
  enable_center_crop: True
  label_smoothing: 0.0
  mixup_alpha: 0.1

  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }

  # learning_rate
  lr_config {
    soft_anneal {
      learning_rate: 0.05
      soft_start: 0.056
      annealing_points: [0.3, 0.6, 0.8]
      annealing_divider: 10
    }
  }
}
eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  model_path: "/workspace/tlt-experiments/classification/output/weights/resnet_080.tlt"
  top_k: 3
  batch_size: 256
  n_workers: 8
  enable_center_crop: True
}

How should I modify it?

Which dataset are you training on? Is it your own dataset?

I am using the dataset from the official tutorial:

OK, you are running with the public dataset (Pascal VOC) mentioned in the Jupyter notebook.
Please refer to the spec files inside the blog https://developer.nvidia.com/blog/preparing-state-of-the-art-models-for-classification-and-object-detection-with-tlt/. That blog reaches SOTA when training the TLT classification network on another public dataset (ImageNet).

My spec file follows the ResNet50 spec from that blog. It is as follows:

model_config {
  arch: "resnet"
  n_layers: 18
  use_batch_norm: True
  use_bias: False
  all_projections: False
  use_pooling: True
  use_imagenet_head: True
  resize_interpolation_method: BICUBIC
  input_image_size: "3,224,224"
}
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/split/train"
  val_dataset_path: "/workspace/tlt-experiments/data/split/val"
  pretrained_model_path: "/workspace/tlt-experiments/classification/pretrained_resnet18/tlt_pretrained_classification_vresnet18/resnet_18.hdf5"
  optimizer {
    sgd {
      lr: 0.05
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 8
  n_epochs: 120
  n_workers: 16
  preprocess_mode: "torch"
  enable_random_crop: true
  enable_center_crop: true
  label_smoothing: 0.0

  # regularizer
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.000015
  }

  # learning_rate
  lr_config {
    cosine {
      learning_rate: 0.05
      soft_start: 0.0
    }
  }
}
eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  model_path: "/workspace/tlt-experiments/classification/output/weights/resnet_120.tlt"
  top_k: 3
  batch_size: 64
  n_workers: 8
  enable_center_crop: True
}
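
(Aside: my understanding of the two preprocess modes, assuming they follow the Keras preprocess_input conventions — a sketch, not TLT source:)

import numpy as np

# Assumed Keras-style conventions for the two modes; TLT may differ.
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68])
TORCH_MEAN_RGB = np.array([0.485, 0.456, 0.406])
TORCH_STD_RGB = np.array([0.229, 0.224, 0.225])

def preprocess(image_rgb, mode):
    """image_rgb: float array of shape (H, W, 3), values in [0, 255]."""
    if mode == "caffe":
        # RGB -> BGR, then subtract per-channel ImageNet means
        return image_rgb[..., ::-1] - CAFFE_MEAN_BGR
    if mode == "torch":
        # scale to [0, 1], then normalize with ImageNet mean/std
        return (image_rgb / 255.0 - TORCH_MEAN_RGB) / TORCH_STD_RGB
    raise ValueError(f"unknown mode: {mode}")

A pretrained model exported under one mode generally needs to be fine-tuned under the same mode; mixing them feeds the network out-of-distribution inputs.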

The following error was then raised:

Then I set:

preprocess_mode: "caffe"

and trained again:

Epoch 2/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.6583 - acc: 0.2440 - val_loss: 2.5978 - val_acc: 0.2569
Epoch 3/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.5871 - acc: 0.2449 - val_loss: 2.5089 - val_acc: 0.2437
Epoch 4/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.5380 - acc: 0.2560 - val_loss: 2.4485 - val_acc: 0.2707
Epoch 5/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.4880 - acc: 0.2627 - val_loss: 2.3979 - val_acc: 0.2701
Epoch 6/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.4410 - acc: 0.2730 - val_loss: 2.3296 - val_acc: 0.2922
Epoch 7/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.4137 - acc: 0.2780 - val_loss: 2.2758 - val_acc: 0.2994
Epoch 8/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.3812 - acc: 0.2829 - val_loss: 2.3034 - val_acc: 0.3006
Epoch 9/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.3550 - acc: 0.2900 - val_loss: 2.2490 - val_acc: 0.3114
Epoch 10/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.3210 - acc: 0.2967 - val_loss: 2.2092 - val_acc: 0.3240
Epoch 11/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.2912 - acc: 0.3078 - val_loss: 2.1876 - val_acc: 0.3335
Epoch 12/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.2769 - acc: 0.3106 - val_loss: 2.2292 - val_acc: 0.3275
Epoch 13/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.2599 - acc: 0.3096 - val_loss: 2.2038 - val_acc: 0.3144
Epoch 14/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.2333 - acc: 0.3231 - val_loss: 2.2211 - val_acc: 0.3216
Epoch 15/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.2039 - acc: 0.3348 - val_loss: 2.1316 - val_acc: 0.3509
Epoch 16/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1859 - acc: 0.3392 - val_loss: 2.1328 - val_acc: 0.3695
Epoch 17/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1829 - acc: 0.3404 - val_loss: 2.1672 - val_acc: 0.3443
Epoch 18/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1784 - acc: 0.3496 - val_loss: 2.1075 - val_acc: 0.3749
Epoch 19/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1523 - acc: 0.3506 - val_loss: 2.0896 - val_acc: 0.3713
Epoch 20/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1351 - acc: 0.3528 - val_loss: 2.0488 - val_acc: 0.3934
Epoch 21/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1261 - acc: 0.3611 - val_loss: 2.0480 - val_acc: 0.3838
Epoch 22/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1162 - acc: 0.3618 - val_loss: 2.0941 - val_acc: 0.3719
Epoch 23/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1007 - acc: 0.3688 - val_loss: 2.0847 - val_acc: 0.3629
Epoch 24/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.1010 - acc: 0.3729 - val_loss: 2.0229 - val_acc: 0.3994
Epoch 25/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0819 - acc: 0.3815 - val_loss: 2.0216 - val_acc: 0.4030
Epoch 26/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0704 - acc: 0.3792 - val_loss: 2.1126 - val_acc: 0.3772
Epoch 27/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0687 - acc: 0.3780 - val_loss: 2.1212 - val_acc: 0.3856
Epoch 28/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0664 - acc: 0.3863 - val_loss: 2.0226 - val_acc: 0.4054
Epoch 29/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0407 - acc: 0.3909 - val_loss: 2.0363 - val_acc: 0.3988
Epoch 30/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0348 - acc: 0.3964 - val_loss: 2.0367 - val_acc: 0.4162
Epoch 31/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0340 - acc: 0.3916 - val_loss: 2.0574 - val_acc: 0.3862
Epoch 32/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0235 - acc: 0.3936 - val_loss: 2.1327 - val_acc: 0.3898
Epoch 33/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0061 - acc: 0.4000 - val_loss: 2.0479 - val_acc: 0.3928
Epoch 34/120 1459/1459 [==============================] - 77s 53ms/step - loss: 2.0097 - acc: 0.4027 - val_loss: 2.0329 - val_acc: 0.3958
Epoch 35/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9988 - acc: 0.4094 - val_loss: 2.0076 - val_acc: 0.4240
Epoch 36/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9819 - acc: 0.4108 - val_loss: 2.0119 - val_acc: 0.4138
Epoch 37/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9699 - acc: 0.4181 - val_loss: 2.0248 - val_acc: 0.4234
Epoch 38/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9612 - acc: 0.4190 - val_loss: 2.0359 - val_acc: 0.4150
Epoch 39/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9724 - acc: 0.4190 - val_loss: 2.0179 - val_acc: 0.4317
Epoch 40/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9524 - acc: 0.4269 - val_loss: 1.9906 - val_acc: 0.4168
Epoch 41/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9420 - acc: 0.4307 - val_loss: 2.1150 - val_acc: 0.4018
Epoch 42/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9412 - acc: 0.4329 - val_loss: 2.0282 - val_acc: 0.4156
Epoch 43/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9249 - acc: 0.4364 - val_loss: 2.0378 - val_acc: 0.4054
Epoch 44/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9174 - acc: 0.4354 - val_loss: 1.9994 - val_acc: 0.4317
Epoch 45/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9210 - acc: 0.4379 - val_loss: 2.0309 - val_acc: 0.4072
Epoch 46/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8993 - acc: 0.4497 - val_loss: 2.0352 - val_acc: 0.4168
Epoch 47/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.9035 - acc: 0.4462 - val_loss: 2.0201 - val_acc: 0.4275
Epoch 48/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8831 - acc: 0.4525 - val_loss: 1.9763 - val_acc: 0.4503
Epoch 49/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8754 - acc: 0.4515 - val_loss: 1.9861 - val_acc: 0.4461
Epoch 50/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8792 - acc: 0.4539 - val_loss: 2.0341 - val_acc: 0.4293
Epoch 51/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8713 - acc: 0.4566 - val_loss: 2.0751 - val_acc: 0.4138
Epoch 52/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8625 - acc: 0.4603 - val_loss: 2.0605 - val_acc: 0.4329
Epoch 53/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8515 - acc: 0.4664 - val_loss: 2.0433 - val_acc: 0.4240
Epoch 54/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8355 - acc: 0.4720 - val_loss: 2.0389 - val_acc: 0.4401
Epoch 55/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8307 - acc: 0.4690 - val_loss: 2.0117 - val_acc: 0.4419
Epoch 56/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8273 - acc: 0.4766 - val_loss: 2.0514 - val_acc: 0.4210
Epoch 57/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8130 - acc: 0.4790 - val_loss: 2.0740 - val_acc: 0.4365
Epoch 58/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.8176 - acc: 0.4750 - val_loss: 2.0346 - val_acc: 0.4246
Epoch 59/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7864 - acc: 0.4899 - val_loss: 2.1024 - val_acc: 0.4323
Epoch 60/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7905 - acc: 0.4915 - val_loss: 1.9912 - val_acc: 0.4347
Epoch 61/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7777 - acc: 0.4903 - val_loss: 2.0861 - val_acc: 0.4114
Epoch 62/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7708 - acc: 0.4932 - val_loss: 2.1747 - val_acc: 0.4102
Epoch 63/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7443 - acc: 0.4997 - val_loss: 2.0769 - val_acc: 0.4305
Epoch 64/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7525 - acc: 0.4969 - val_loss: 2.1134 - val_acc: 0.4251
Epoch 65/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7441 - acc: 0.4945 - val_loss: 2.0988 - val_acc: 0.4180
Epoch 66/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7291 - acc: 0.5054 - val_loss: 2.0759 - val_acc: 0.4168
Epoch 67/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7190 - acc: 0.5132 - val_loss: 2.0961 - val_acc: 0.4299
Epoch 68/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.7063 - acc: 0.5091 - val_loss: 2.0537 - val_acc: 0.4281
Epoch 69/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6988 - acc: 0.5170 - val_loss: 2.0661 - val_acc: 0.4210
Epoch 70/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6844 - acc: 0.5173 - val_loss: 2.0837 - val_acc: 0.4353
Epoch 71/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6663 - acc: 0.5202 - val_loss: 2.1386 - val_acc: 0.4293
Epoch 72/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6578 - acc: 0.5240 - val_loss: 2.1105 - val_acc: 0.4192
Epoch 73/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6371 - acc: 0.5315 - val_loss: 2.1689 - val_acc: 0.4228
Epoch 74/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6317 - acc: 0.5358 - val_loss: 2.1762 - val_acc: 0.4299
Epoch 75/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6161 - acc: 0.5442 - val_loss: 2.2001 - val_acc: 0.4192
Epoch 76/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.6098 - acc: 0.5376 - val_loss: 2.2024 - val_acc: 0.3988
Epoch 77/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5977 - acc: 0.5467 - val_loss: 2.1887 - val_acc: 0.4246
Epoch 78/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5704 - acc: 0.5545 - val_loss: 2.2151 - val_acc: 0.4228
Epoch 79/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5716 - acc: 0.5528 - val_loss: 2.2029 - val_acc: 0.4096
Epoch 80/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5705 - acc: 0.5542 - val_loss: 2.2774 - val_acc: 0.4144
Epoch 81/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5499 - acc: 0.5579 - val_loss: 2.1935 - val_acc: 0.4018
Epoch 82/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5490 - acc: 0.5548 - val_loss: 2.2476 - val_acc: 0.4096
Epoch 83/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5378 - acc: 0.5611 - val_loss: 2.2558 - val_acc: 0.4072
Epoch 84/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.5017 - acc: 0.5748 - val_loss: 2.2876 - val_acc: 0.3886
Epoch 85/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4908 - acc: 0.5767 - val_loss: 2.2855 - val_acc: 0.4048
Epoch 86/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4817 - acc: 0.5785 - val_loss: 2.3294 - val_acc: 0.3928
Epoch 87/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4663 - acc: 0.5818 - val_loss: 2.3389 - val_acc: 0.3868
Epoch 88/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4623 - acc: 0.5857 - val_loss: 2.3066 - val_acc: 0.3916
Epoch 89/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4549 - acc: 0.5832 - val_loss: 2.4114 - val_acc: 0.3862
Epoch 90/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4346 - acc: 0.5903 - val_loss: 2.4058 - val_acc: 0.3910
Epoch 91/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4255 - acc: 0.5920 - val_loss: 2.4914 - val_acc: 0.3844
Epoch 92/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4061 - acc: 0.6012 - val_loss: 2.5144 - val_acc: 0.3719
Epoch 93/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.4008 - acc: 0.5977 - val_loss: 2.4486 - val_acc: 0.3904
Epoch 94/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3715 - acc: 0.6066 - val_loss: 2.4865 - val_acc: 0.3880
Epoch 95/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3675 - acc: 0.6097 - val_loss: 2.4695 - val_acc: 0.3844
Epoch 96/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3406 - acc: 0.6097 - val_loss: 2.5387 - val_acc: 0.3719
Epoch 97/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3255 - acc: 0.6180 - val_loss: 2.5366 - val_acc: 0.3695
Epoch 98/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3206 - acc: 0.6206 - val_loss: 2.6093 - val_acc: 0.3665
Epoch 99/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.3243 - acc: 0.6190 - val_loss: 2.5337 - val_acc: 0.3719
Epoch 100/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2879 - acc: 0.6283 - val_loss: 2.6548 - val_acc: 0.3713
Epoch 101/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2938 - acc: 0.6237 - val_loss: 2.6170 - val_acc: 0.3617
Epoch 102/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.2718 - acc: 0.6307 - val_loss: 2.6590 - val_acc: 0.3754
Epoch 103/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2621 - acc: 0.6357 - val_loss: 2.6270 - val_acc: 0.3725
Epoch 104/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2506 - acc: 0.6392 - val_loss: 2.6640 - val_acc: 0.3623
Epoch 105/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2333 - acc: 0.6429 - val_loss: 2.7536 - val_acc: 0.3593
Epoch 106/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2234 - acc: 0.6489 - val_loss: 2.7582 - val_acc: 0.3581
Epoch 107/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2326 - acc: 0.6421 - val_loss: 2.7256 - val_acc: 0.3665
Epoch 108/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2144 - acc: 0.6494 - val_loss: 2.7430 - val_acc: 0.3545
Epoch 109/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.2060 - acc: 0.6450 - val_loss: 2.7881 - val_acc: 0.3533
Epoch 110/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1958 - acc: 0.6506 - val_loss: 2.7936 - val_acc: 0.3569
Epoch 111/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1885 - acc: 0.6560 - val_loss: 2.8056 - val_acc: 0.3551
Epoch 112/120 1459/1459 [==============================] - 77s 53ms/step - loss: 1.1901 - acc: 0.6532 - val_loss: 2.8123 - val_acc: 0.3485
Epoch 113/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1692 - acc: 0.6651 - val_loss: 2.8647 - val_acc: 0.3425
Epoch 114/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1728 - acc: 0.6597 - val_loss: 2.8762 - val_acc: 0.3467
Epoch 115/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1604 - acc: 0.6678 - val_loss: 2.8392 - val_acc: 0.3473
Epoch 116/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1585 - acc: 0.6695 - val_loss: 2.8372 - val_acc: 0.3521
Epoch 117/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1679 - acc: 0.6664 - val_loss: 2.8419 - val_acc: 0.3545
Epoch 118/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1596 - acc: 0.6607 - val_loss: 2.8294 - val_acc: 0.3539
Epoch 119/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1653 - acc: 0.6595 - val_loss: 2.8524 - val_acc: 0.3509
Epoch 120/120 1459/1459 [==============================] - 78s 53ms/step - loss: 1.1686 - acc: 0.6593 - val_loss: 2.8629 - val_acc: 0.3497
2021-08-15 13:25:52,265 [INFO] __main__: Total Val Loss: 2.862879753112793
2021-08-15 13:25:52,265 [INFO] __main__: Total Val accuracy: 0.34970059990882874
2021-08-15 13:25:52,265 [INFO] __main__: Training finished successfully.
2021-08-15 21:25:54,514 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

How can I optimize it?

Supplement:

learning_rate: 0.005
The results are as follows:
2021-08-15 16:15:49,017 [INFO] __main__: Total Val Loss: 2.9649219512939453
2021-08-15 16:15:49,017 [INFO] __main__: Total Val accuracy: 0.37964072823524475

learning_rate: 0.001
The results are as follows:
2021-08-16 03:23:17,321 [INFO] __main__: Total Val Loss: 2.764241933822632
2021-08-16 03:23:17,322 [INFO] __main__: Total Val accuracy: 0.37185630202293396

learning_rate: 0.0001
The results are as follows:
2021-08-16 06:07:00,108 [INFO] __main__: Total Val Loss: 1.75017511844635
2021-08-16 06:07:00,108 [INFO] __main__: Total Val accuracy: 0.473053902387619
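
In short, lowering the base learning rate helped steadily, with 0.0001 giving the best of these runs (val accuracy 0.4731), though still below the 0.5353 from the very first configuration.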

I will try to reproduce your result. Currently, you are getting an accuracy of 0.5353 when training a resnet18 classification model on the VOC dataset.

OK!
Maybe I made a mistake.

I can reproduce your result. I am checking internally and will update you when there is new info.

OK, thanks!