PeopleNet Dataset Training Images resolution and number of images required

Hello,

I want to retrain PeopleNet using the Transfer Learning Toolkit (TLT) and prune it for use on an NVIDIA Jetson Nano.

  1. Do I have to keep the dimensions of all training images the same? I have a dataset containing images of different resolutions. What would you recommend in that case?

  2. How many images do you think are required for retraining? I am using 100 images; will that be enough?

  3. Can you recommend an annotation tool? I used LabelImg previously, but it does not give output in KITTI format.

  4. How do I train the model to avoid false positives? In my test, some dogs, cows, etc. were detected as “person”. There are 3 classes: person, face and bag. Should I add a new “other” class? What should I do?

Thanks for the help.

  1. Yes, it is needed. You need to resize the images to the final training size and scale the labels to match. See the TLT user guide excerpt below.
    For PeopleNet, you can set the size to 960x544.

DetectNet_v2

  • Input size: C * W * H (where C = 1 or 3, W >= 480, H >= 272, and W, H are multiples of 16)
  • Image format: JPG, JPEG, PNG
  • Label format: KITTI detection

Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
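The label side of that offline resize can be sketched as below. This is a minimal illustration, not part of TLT: the function names `is_valid_input_size` and `scale_kitti_line` are hypothetical, and it assumes standard 15-field KITTI label lines in which fields 4 to 7 hold x1, y1, x2, y2. The image itself would be resized separately with whatever image library you prefer.

```python
def is_valid_input_size(width, height):
    """Check the DetectNet_v2 constraints quoted above:
    W >= 480, H >= 272, and both multiples of 16."""
    return (width >= 480 and height >= 272
            and width % 16 == 0 and height % 16 == 0)


def scale_kitti_line(line, src_size, dst_size=(960, 544)):
    """Scale the bounding box of one KITTI label line.

    src_size and dst_size are (width, height) tuples. Assumes the
    standard KITTI layout where fields 4..7 are x1, y1, x2, y2.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx, sy = dst_w / src_w, dst_h / src_h

    fields = line.split()
    x1, y1, x2, y2 = (float(v) for v in fields[4:8])
    fields[4:8] = [f"{x1 * sx:.2f}", f"{y1 * sy:.2f}",
                   f"{x2 * sx:.2f}", f"{y2 * sy:.2f}"]
    return " ".join(fields)
```

For example, a box annotated on a 1920x1088 image is scaled by 0.5 on both axes when the image is resized to 960x544. If the source aspect ratio differs from the target, the two axes are scaled by different factors, so objects will look stretched in the resized image.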

  2. It is difficult to say how many images are needed for each class. More is better, but it costs more training time. If the training result is not good, we normally consider adding more images.

  3. You can search for an annotation tool. If its output is not in KITTI format, please write a script to convert it to KITTI format. In TLT, the label file only cares about class_name, x1, y1, x2, y2.
    More info in: Is it possible to generate .tfrecorfs for tlt training directly without using intermediate kitti format?

  4. End users can add or delete classes. More info in:
    People Net -
    PeopleNet v1.0 unpruned model shows very bad results on COCO dataset
    Peoplenet Inference
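
For the annotation-format conversion mentioned above, here is a minimal sketch. It assumes the annotations were saved as Pascal VOC XML (one of the formats LabelImg can write) and that class names contain no spaces. The function name `voc_xml_to_kitti` is hypothetical. Since only class_name and x1, y1, x2, y2 matter to TLT, the remaining KITTI fields are filled with zeros.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_kitti(xml_text):
    """Convert one Pascal VOC annotation (as an XML string) into a
    list of 15-field KITTI label lines, one per annotated object."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = [float(box.findtext(tag))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        bbox = " ".join(f"{c:.2f}" for c in coords)
        # Layout: class, truncated, occluded, alpha, bbox (4 values),
        # then dimensions (3), location (3), rotation_y (1) as zeros.
        lines.append(
            f"{name} 0.00 0 0.00 {bbox} 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
        )
    return lines
```

Writing the returned lines to a `.txt` file with the same base name as the image gives one label file per image.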
