No detections after training PeopleNet using custom labeled data

gabe_ddi · August 3, 2020, 5:58am

I am trying to improve the performance of the PeopleNet model using around 1300 labeled 1920x1080 png images.

I used the following command:
tlt-train detectnet_v2 -k tlt_encode -r /workspace/tlt-experiments/ -e train.txt

My train.txt file is:

random_seed: 42
model_config {
  num_layers: 18
  pretrained_model_file: "/workspace/tlt-experiments/resnet34_peoplenet.tlt"
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
  all_projections: true
}
# Sample rasterizer configs to instantiate a 3 class bbox rasterizer
bbox_rasterizer_config {
  target_class_config {
    key: "person"
    value: {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.4
      cov_radius_y: 0.4
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.67
}
postprocessing_config {
  target_class_config {
    key: "person"
    value: {
      clustering_config {
        coverage_threshold: 0.005
        dbscan_eps: 0.15
        dbscan_min_samples: 0.05
        minimum_bounding_box_height: 20
      }
    }
  }
}
cost_function_config {
  target_classes {
    name: "person"
    class_weight: 1.0
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: True
  max_objective_weight: 0.9999
  min_objective_weight: 0.0001
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-6
      max_learning_rate: 5e-4
      soft_start: 0.1
      annealing: 0.7
    }
  }
  regularizer {
    type: L1
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-08
      beta1: 0.9
      beta2: 0.999
    }
  }
  cost_scaling {
    enabled: False
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
}
# Sample augmentation config
augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    output_image_channel: 3
    min_bbox_width: 1.0
    min_bbox_height: 1.0
  }
  spatial_augmentation {

    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    color_shift_stddev: 0.0
    hue_rotation_max: 25.0
    saturation_shift_max: 0.2
    contrast_scale_max: 0.1
    contrast_center: 0.5
  }
}
evaluation_config {
  average_precision_mode: INTEGRATE
  validation_period_during_training: 10
  first_validation_epoch: 1
  minimum_detection_ground_truth_overlap {
    key: "person"
    value: 0.5
  }
  evaluation_box_config {
    key: "person"
    value {
      minimum_height: 4
      maximum_height: 9999
      minimum_width: 4
      maximum_width: 9999
    }
  }
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tlt-experiments/tf_records/*"
    image_directory_path: "/workspace/tlt-experiments/"
  }
  image_extension: "png"
  target_class_mapping {
      key: "person"
      value: "person"
  }
  validation_fold: 0
}

The results of training at 80 epochs are:

Epoch 80/80
=========================

Validation cost: 0.000043
Mean average_precision (in %): 98.6850

class name      average precision (in %)
------------  --------------------------
person                            98.685

Median Inference Time: 0.013576

I understand that for deployment I would prune the model, but I just wanted to check accuracy on the site camera, so I used the command below to export the model for DeepStream 5.0 DP:

tlt-export detectnet_v2 -m /workspace/tlt-experiments/weights/model.tlt -o /workspace/tlt-experiments/weights/peoplenet_detector_unpruned.etlt -k tlt_encode

I get the following:

Using TensorFlow backend.
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG [/usr/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['output_cov/Sigmoid', 'output_bbox/BiasAdd'] as outputs
[TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[TensorRT] INFO: Detected 1 inputs and 2 output network tensors.

Then I used the same DeepStream code that was running the standard PeopleNet model, with the changes shown below:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=/home/ddi/Social%20Distancing/CampsieRSL/deepstream/dev/local-testing/peoplenet_detector_unpruned.etlt
#tlt-encoded-model=/opt/nvidia/deepstream/deepstream-5.0/samples/models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt
labelfile-path=labels_peoplenet.txt
#model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_fp16.engine
input-dims=3;544;960;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.2
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

I have changed the number of classes to just 1 (person), and the label file now contains only person. When I run the DeepStream app, it does not detect a single person unless I set pre-cluster-threshold below 0.1, which gives mostly false positives.

Have I missed something? Does it matter that I am only using 1 class? Do the training images need to be 960x544 rather than 1920x1080?
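
A quick way to rule out a size mismatch between the two configs is to compare input-dims in the DeepStream [property] section with the preprocessing size in the training spec. The following is a minimal sketch, not an official tool: the constants are copied by hand from the post, parse_input_dims is a hypothetical helper, and the 1/255 check assumes the model expects pixel values scaled to the range 0 to 1.

```python
import configparser

# Values copied by hand from augmentation_config.preprocessing in the spec.
SPEC_WIDTH, SPEC_HEIGHT, SPEC_CHANNELS = 960, 544, 3

# Trimmed copy of the [property] section from the post.
NVINFER_CONFIG = """
[property]
net-scale-factor=0.0039215697906911373
input-dims=3;544;960;0
num-detected-classes=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
"""


def parse_input_dims(value):
    """Split a 'C;H;W;order' string into integers (channels, height, width)."""
    channels, height, width, _order = (int(part) for part in value.split(";"))
    return channels, height, width


parser = configparser.ConfigParser()
parser.read_string(NVINFER_CONFIG)
props = parser["property"]

channels, height, width = parse_input_dims(props["input-dims"])
dims_match = (channels, height, width) == (SPEC_CHANNELS, SPEC_HEIGHT, SPEC_WIDTH)

# Assumption: pixels are scaled to [0, 1], so the factor should be close to 1/255.
scale_ok = abs(float(props["net-scale-factor"]) - 1.0 / 255.0) < 1e-6

print("input-dims match spec:", dims_match)
print("net-scale-factor close to 1/255:", scale_ok)
```

With the values posted above both checks pass, which suggests the DeepStream input size itself is consistent with the spec and the cause lies elsewhere.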

Morganh · August 3, 2020, 7:23am

Can you first run tlt-infer against the test dataset and check its output folder?

Also:

You need to resize your images and labels to 960x544 offline. Alternatively, you can keep your images and labels as they are, but then you need to set the size to 1920x1088 in the spec. The width and height should be multiples of 16.
You also need to change

num_layers: 18

to

num_layers: 34

so that it matches the ResNet-34 pretrained model (resnet34_peoplenet.tlt).
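
The three points above (multiple of 16, offline label resizing, and matching num_layers to the pretrained backbone) can be sketched as follows. This is only an illustration under stated assumptions: it assumes the labels are in KITTI text format with the box in columns 4 to 7 (x1, y1, x2, y2), the helper names are hypothetical, and the images themselves still have to be resized separately with an image library. Note that going from 1920x1080 to 960x544 changes the aspect ratio slightly, so x and y use different scale factors.

```python
import re


def round_up_to_multiple(value, multiple=16):
    """Round value up to the nearest multiple (e.g. 1080 becomes 1088 for 16)."""
    return ((value + multiple - 1) // multiple) * multiple


def scale_kitti_line(line, scale_x, scale_y):
    """Scale the bbox columns (x1, y1, x2, y2) of one KITTI-format label line."""
    fields = line.split()
    for index, factor in zip((4, 5, 6, 7), (scale_x, scale_y, scale_x, scale_y)):
        fields[index] = f"{float(fields[index]) * factor:.2f}"
    return " ".join(fields)


def backbone_depth_from_name(path):
    """Guess the ResNet depth from a file name such as 'resnet34_peoplenet.tlt'."""
    match = re.search(r"resnet(\d+)", path)
    return int(match.group(1)) if match else None


# Source and target sizes taken from the thread.
src_w, src_h = 1920, 1080
dst_w, dst_h = 960, 544

# Made-up label line, used only to illustrate the scaling.
example = "person 0.00 0 0.00 100.00 200.00 300.00 600.00 0 0 0 0 0 0 0"
scaled = scale_kitti_line(example, dst_w / src_w, dst_h / src_h)

print("padded height:", round_up_to_multiple(src_h))
print("scaled label:", scaled)
print("expected num_layers:",
      backbone_depth_from_name("/workspace/tlt-experiments/resnet34_peoplenet.tlt"))
```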

gabe_ddi · August 3, 2020, 11:47pm

Yeah, the results of tlt-infer were very poor.

I will try keeping the labels and images, changing the spec to 1920x1088 and the layers to 34, then retrain and see.

If that doesn't work, I will resize to 960x544.

gabe_ddi · August 6, 2020, 5:34am

So I ran multiple training tests using 1920x1088 and 960x544, with the layers changed to 34. Both performed worse than the normal PeopleNet model, even when using the unpruned model. Is there anything else I need to change in my train.txt to improve performance, or is it a case of needing more data and a better mix of data? Also, is it recommended to include images with no people/labels to improve accuracy?

Morganh · August 6, 2020, 5:41am

I see that your training reaches a high mAP (98.6850). This result is generated by tlt-evaluate.
So please run a quick test: use tlt-infer to run inference against the same validation dataset (it should be part 0 of your /workspace/tlt-experiments/tf_records/*, because you set validation_fold: 0 in the spec) to see whether the tlt-infer results agree with the mAP reported by tlt-evaluate.
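
To compare inference output with the ground truth by hand, one rough option is to count how many predicted boxes overlap a labeled box by at least the 0.5 threshold set under minimum_detection_ground_truth_overlap in the spec. The sketch below is a simplified per-image count with greedy matching, not the mAP computation that tlt-evaluate performs; the function names and the (x1, y1, x2, y2) box layout are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def count_matches(predictions, ground_truth, min_overlap=0.5):
    """Greedily match predictions to labels one-to-one.

    Returns (true_positives, false_positives, missed_labels).
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        # Pick the remaining label that overlaps this prediction the most.
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= min_overlap:
            unmatched.remove(best)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    return true_positives, false_positives, len(unmatched)


# Made-up boxes for one image, used only to exercise the functions.
labels = [(0, 0, 100, 100), (200, 200, 300, 300)]
detections = [(10, 10, 110, 110), (400, 400, 450, 450)]
print(count_matches(detections, labels))
```

If counts like these look poor on the validation fold while the reported mAP is high, that points at a difference between the evaluation and inference setups rather than at the training data.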

H19012 · April 2, 2021, 12:40am

I am having the same issue: I get a high mAP, but inference on DeepStream is not good.

Morganh · April 2, 2021, 12:43am

@H19012
Please create a new forum topic. Thanks.
