Training Trainable Peoplenet .tlt with Custom Human Images

I trained the PeopleNet model with some images of humans lying down, since the default one only detects people standing up. The results are bad (I used the detectnet_v2 Jupyter notebook, both train and retrain).

Do you have the train spec files for PeopleNet (not the default detectnet_v2 specs)?
Does the ordering of the class names in the train spec matter (person, bag, face vs. face, bag, person)?

I did notice that setting load_graph to true in the train spec gives an input size error, even though I kept the number of classes the same.

It does not matter.

Your case is quite different from PeopleNet's training dataset. See the "TRAINING DATA" section in PeopleNet | NVIDIA NGC.

So the trainable PeopleNet model is just a starting point for your case. How many training images are in your dataset, and what is their resolution?

Not yet. It should be similar to the default detectnet_v2 spec.
For load_graph: if set to true, it will load the graph from the pretrained model file along with the weights. The width and height must then exactly match PeopleNet's input size (960 x 544).
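To make the two constraints above concrete, here is a minimal sketch of the relevant spec sections. The field names follow the detectnet_v2 spec format; the file path and layer count are placeholders for illustration, not the official PeopleNet spec:

```
model_config {
  # Path to the trainable PeopleNet .tlt (placeholder path)
  pretrained_model_file: "/workspace/peoplenet/resnet34_peoplenet.tlt"
  load_graph: true   # loads the graph AND the weights from the pretrained file
  arch: "resnet"
  num_layers: 34
}
augmentation_config {
  preprocessing {
    # With load_graph: true these must match PeopleNet's input size exactly
    output_image_width: 960
    output_image_height: 544
    output_image_channel: 3
  }
}
```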

  • If order does not matter, does capitalization matter? "Person" vs. "person".

  • Around 7K images, 1920x1080
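Note that 1920x1080 does not map cleanly onto PeopleNet's 960x544 input: a uniform 0.5x scale gives 960x540, leaving 4 px to pad. A small sketch of that arithmetic (the function name is hypothetical; labels such as KITTI bbox coordinates must be rescaled by the same factor):

```python
def letterbox_dims(src_w, src_h, dst_w, dst_h):
    """Uniform scale that fits the source inside the target, plus padding."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = dst_w - new_w, dst_h - new_h
    return scale, new_w, new_h, pad_x, pad_y

scale, w, h, px, py = letterbox_dims(1920, 1080, 960, 544)
# 1920x1080 scales by 0.5 to 960x540, with 4 px of vertical padding
```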

  • There were values for initial_weight, weight_target, and class_weight in the default detectnet_v2 train spec, all slightly different for pedestrian, car, etc. Are there any docs with a detailed explanation of these values, or of what they were for the default PeopleNet? The "Cost Function" section in DetectNet_v2 — TAO Toolkit 3.22.05 documentation does not explain them.

Yes, it matters. I suggest you first run a baseline with the existing PeopleNet. That means you can use "tao evaluate xxx" to get a baseline result against your own dataset. Please refer to PeopleNet v1.0 unpruned model shows very bad results on COCO dataset - #11 by Morganh.
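A sketch of what that baseline evaluation command could look like. The paths and key are placeholders (the spec file's validation section should point at your own dataset); the script only echoes the command rather than running it, since TAO must be installed and configured first:

```shell
SPEC=/workspace/specs/peoplenet_eval.txt              # placeholder spec path
MODEL=/workspace/peoplenet/resnet34_peoplenet.tlt     # placeholder model path
KEY=tlt_encode                                        # placeholder encryption key

# Assemble the evaluate command; drop the echo to actually run it
CMD="tao detectnet_v2 evaluate -e $SPEC -m $MODEL -k $KEY"
echo "$CMD"
```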

Refer to Details on cost_function_config for PeopleNet - #3 by Morganh
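For orientation, the fields asked about live in the cost_function_config block of the detectnet_v2 spec. The structure below follows the detectnet_v2 spec format, but the numeric values are illustrative assumptions, not the official PeopleNet settings:

```
cost_function_config {
  target_classes {
    name: "person"
    class_weight: 1.0                # relative weight of this class in the total loss
    coverage_foreground_weight: 0.05
    objectives {
      name: "cov"                    # coverage (objectness) objective
      initial_weight: 1.0
      weight_target: 1.0             # the weight anneals from initial_weight toward weight_target
    }
    objectives {
      name: "bbox"                   # bounding-box regression objective
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  # ...repeat target_classes for "bag" and "face"
}
```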