Classification training does not converge well

I am training a two-class classification model and training does not converge very well.
The training data size is 17,000 images for each class.
How can I improve training?
training.log (2.0 KB) classification_retrain_spec.log (1.1 KB)

Please consider decreasing the batch size.
Also, please consider changing the learning-rate configuration to:

lr_config {
  # scheduler: "step"
  # learning_rate: 0.006
  # step_size: 10
  # gamma: 0.1

  scheduler: "soft_anneal"
  learning_rate: 0.05
  soft_start: 0.056
  annealing_points: "0.3, 0.6, 0.8"
  annealing_divider: 10
}

May I know what the differences are between the two configurations? The new configuration works very well for my data.

For “step”, it implements a step learning-rate annealing schedule according to the progress of the training. The scheduler adjusts the learning rate of the experiment in steps at regular intervals.
The learning rate is reduced at every step.
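To make the behavior concrete, here is a minimal sketch of a step schedule. It is illustrative only: the function name `step_lr` is hypothetical, and I am assuming progress is measured as a percentage of total training so that it matches the commented-out `step_size: 10` above.

```python
def step_lr(base_lr, gamma, step_size, progress_pct):
    """Step schedule sketch (hypothetical helper, not the toolkit's API).

    base_lr:      initial learning rate (e.g. 0.006)
    gamma:        multiplicative decay per step (e.g. 0.1)
    step_size:    interval between drops, as percent of training (e.g. 10)
    progress_pct: current training progress in percent (0..100)
    """
    # Number of completed steps so far; the LR drops by a factor of
    # gamma at each one.
    num_steps = int(progress_pct // step_size)
    return base_lr * (gamma ** num_steps)
```

With `learning_rate: 0.006`, `step_size: 10`, `gamma: 0.1`, the rate would be 0.006 at the start, 0.0006 after 10% of training, 0.00006 after 20%, and so on.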

For “soft_anneal”, this learning-rate scheduler adjusts the learning rate in the following phases:

- Phase 1 (0.0 <= progress < soft_start): starting from start_lr, linearly increase the learning rate to base_lr.
- Phase 2: at every annealing point, divide the learning rate by the annealing divider.
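The two phases above can be sketched as follows. This is an illustration, not the toolkit's implementation: the function name `soft_anneal_lr` is hypothetical, and the choice of `start_lr = base_lr / annealing_divider` for the warm-up floor is an assumption made here for the sketch.

```python
def soft_anneal_lr(base_lr, soft_start, annealing_points,
                   annealing_divider, progress):
    """Soft-anneal schedule sketch (hypothetical helper).

    base_lr:           peak learning rate (e.g. 0.05)
    soft_start:        fraction of training used for warm-up (e.g. 0.056)
    annealing_points:  progress fractions at which the LR drops
                       (e.g. [0.3, 0.6, 0.8])
    annealing_divider: factor applied at each annealing point (e.g. 10)
    progress:          training progress as a fraction in [0, 1]
    """
    # Assumption: warm-up starts from base_lr / annealing_divider.
    start_lr = base_lr / annealing_divider

    if progress < soft_start:
        # Phase 1: linear ramp from start_lr up to base_lr.
        return start_lr + (base_lr - start_lr) * (progress / soft_start)

    # Phase 2: divide by the annealing divider at each point passed.
    lr = base_lr
    for point in annealing_points:
        if progress >= point:
            lr /= annealing_divider
    return lr
```

For the configuration above, the rate would ramp up to 0.05 during the first 5.6% of training, then drop to 0.005 at 30% progress, 0.0005 at 60%, and 0.00005 at 80%.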