Description
I had a training setup using the Object Detection API that worked really well. However, I have had to upgrade from TF1.15 to TF2, so instead of model_main.py I am now using model_main_tf2.py with the MobileNet SSD 320x320 pipeline to transfer-train a new model.
When training my model in TF1.15, TensorBoard would display a whole heap of scalars as well as detection box image samples. It was fantastic.
In TF2 training I get no such data, just loss scalars and 3 input images, and yet the event files are huge: gigabytes, whereas they were in the hundreds of megabytes with TF1.15.
The thing is, there is nowhere to specify what data is presented. I have not changed anything other than which model_main script I use to run the training. I added num_visualizations to the eval_config section of the pipeline config file, but no visualizations of detection boxes appear.
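One possibility I have not been able to confirm: model_main_tf2.py seems to run evaluation as a separate job when --checkpoint_dir is passed, and the detection-box images may only be written by that job rather than by training. A minimal sketch of building both command lines follows. The config path and model directory are hypothetical placeholders, and the snippet only prints the commands rather than launching anything.

```python
import sys

# Hypothetical placeholder paths; substitute your own.
PIPELINE_CONFIG = "legacy/training/pipeline.config"
MODEL_DIR = "legacy/training"


def build_command(script, evaluate=False):
    """Return the argument list for a training or an evaluation run."""
    cmd = [
        sys.executable,
        script,
        f"--pipeline_config_path={PIPELINE_CONFIG}",
        f"--model_dir={MODEL_DIR}",
    ]
    if evaluate:
        # Assumption: passing --checkpoint_dir switches model_main_tf2.py
        # from training to continuous evaluation of saved checkpoints.
        cmd.append(f"--checkpoint_dir={MODEL_DIR}")
    return cmd


train_cmd = build_command("model_main_tf2.py")
eval_cmd = build_command("model_main_tf2.py", evaluate=True)

print(" ".join(train_cmd))
print(" ".join(eval_cmd))
```

If this is right, the two commands would be run in separate terminals, with TensorBoard pointed at the shared model directory so that both the training and the evaluation summaries show up.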
Can someone please explain to me what is going on? I need to be able to see what's happening throughout training!
Thank You
I am training on a PC in a virtual environment before performing TRT optimization on Linux, but I think that is irrelevant here.
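To pin down where the gigabytes mentioned above come from, a small script can list every event file under the model directory with its size. This is a minimal sketch using only the standard library. The directory name is a hypothetical placeholder, and matching on "tfevents" in the file name is an assumption about how the event files are named.

```python
import os


def event_file_sizes(model_dir):
    """Return (relative_path, size_in_MB) for each event file, largest first."""
    results = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            # Assumption: event files contain "tfevents" in their file name.
            if "tfevents" in name:
                path = os.path.join(root, name)
                size_mb = os.path.getsize(path) / (1024 * 1024)
                results.append((os.path.relpath(path, model_dir), size_mb))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # "legacy/training" is a placeholder for the directory passed as --model_dir.
    for rel_path, size_mb in event_file_sizes("legacy/training"):
        print(f"{size_mb:10.1f} MB  {rel_path}")
```

Comparing the output for the TF1.15 and TF2 model directories should show whether the extra size sits in the training summaries or somewhere else.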
Environment
GPU Type: P220
Operating System + Version: Win10 Pro
Python Version (if applicable): 3.6
TensorFlow Version (if applicable): 2
Relevant Files
TF1.15 vs TF2 screenshots:
Steps To Reproduce
The repo I am working with: https://github.com/tensorflow/models/tree/master/research/object_detection
pipeline config:
# SSD with Mobilenet v2
# Trained on COCO17, initialized from Imagenet classification checkpoint
# Train on TPU-8
#
# Achieves 22.2 mAP on COCO17 Val
model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 2
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.97,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2_keras'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.97,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.75,
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          delta: 1.0
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
train_config: {
  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint: "legacy/ssd_mobilenet_v2_320x320_coco17/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  batch_size: 12
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  num_steps: 70000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: .8
          total_steps: 70000
          warmup_learning_rate: 0.13333
          warmup_steps: 2000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}
train_input_reader: {
  label_map_path: "legacy/training/object-detection.pbtxt"
  tf_record_input_reader {
    input_path: "legacy/data/train.record"
  }
}
eval_config: {
  metrics_set: "coco_detection_metrics"
  retain_original_images: false
  use_moving_averages: false
  num_visualizations: 45
  min_score_threshold: 0.35
  max_evals: 10
}
eval_input_reader: {
  label_map_path: "legacy/training/object-detection.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "legacy/data/test.record"
  }
}
UPDATE: I have investigated further and discovered that the TensorBoard summaries are set up in https://github.com/tensorflow/models/blob/master/research/object_detection/model_lib.py for TF1.15 and in https://github.com/tensorflow/models/blob/master/research/object_detection/model_lib_v2.py for TF2.
So if someone who knows more about this than I do could work out what the difference is, and what I need to do to get the same result in TensorBoard with v2 as I do with v1, that would be amazing and would save me an enormous headache. It would seem that model_lib_v2.py, even though it is documented as being for TF2, does not actually follow TF2 syntax, but I could be wrong.
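As a starting point for comparing the two files, a crude text scan can list every line that looks like a summary call. This is a minimal sketch, not a real analysis. It assumes local copies of model_lib.py and model_lib_v2.py in the working directory, and the regular expression is my own guess at what a summary call looks like. It will miss summaries produced indirectly, for example through helper modules or evaluation metric ops.

```python
import re

# Matches text such as tf.summary.scalar(...) or tf.compat.v2.summary.image(...).
SUMMARY_CALL = re.compile(r"\bsummary\.(\w+)\s*\(")


def find_summary_calls(source_text):
    """Return (line_number, summary_kind, stripped_line) for each match."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in SUMMARY_CALL.finditer(line):
            hits.append((lineno, match.group(1), line.strip()))
    return hits


if __name__ == "__main__":
    # Assumption: both files have been copied into the current directory.
    for filename in ("model_lib.py", "model_lib_v2.py"):
        try:
            with open(filename, encoding="utf-8") as handle:
                hits = find_summary_calls(handle.read())
        except FileNotFoundError:
            print(f"{filename}: not found, skipping")
            continue
        print(f"{filename}: {len(hits)} summary call(s)")
        for lineno, kind, text in hits:
            print(f"  line {lineno}: {kind}: {text}")
```

Diffing the two lists by summary kind (scalar, image, and so on) should at least show which kinds of summaries each file writes directly.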