YOLO v3 Training


I'm trying to train YOLO v3 with TLT.
My training images are 1080x720.

As far as I know, the YOLO v3 input size is 608 or 416.
Can I train with 1080x720 images?
And should I change big_anchor_shape, mid_anchor_shape, … in the training config file to match the input image size?

I also wonder which size I should use when I import my custom YOLO v3 model into DeepStream.
Should I resize the nvinfer input to 608, or keep 1080x720?

See the yolo_v3 section in the TLT user guide. For training, you can set the size to 1088x736, 1088x704, etc.


  • Input size : C * W * H (where C = 1 or 3, W >= 128, H >= 128, W, H are multiples of 32)
  • Image format : JPG, JPEG, PNG
  • Label format : KITTI detection
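The size rule above (multiples of 32, at least 128) can be checked with a few lines of Python. This is only an illustrative sketch; `valid_sizes` is a hypothetical helper name, not part of TLT.

```python
def valid_sizes(value, multiple=32, minimum=128):
    """Return the nearest valid sizes at or below and at or above `value`."""
    lower = max(minimum, (value // multiple) * multiple)
    # -(-a // b) is integer ceiling division
    upper = max(minimum, -(-value // multiple) * multiple)
    return lower, upper

print(valid_sizes(1080))  # (1056, 1088)
print(valid_sizes(720))   # (704, 736)
```

For 1080x720 images this gives 1088 as the closest valid width and 704 or 736 as the closest valid heights.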

For deployment via DeepStream, you only need to set the same size that you used in training.

Thanks for the reply.

Actually, I resized my training images to multiples of 32

and trained,

but my training results are very strange.

First training
class : car, person
car = 60
person = 14

Second training
class : person
person = 14

I used free images from Google, Caltech person images, and COCO person images
for training.
What should I do to get better results?

Please share your full training log and training spec file.

Here are my training log and spec file.
The training log was big,
so I split it into two files.

spec.txt (1.6 KB)
train_log2.txt (2.8 MB) train_log1.txt (2.9 MB)

First, please check whether your spec.txt is exactly the spec file used during training. I'm asking because the spec sets the following:
output_image_width: 1280
output_image_height: 960

but the training log shows that the input was 608x608:

Input (InputLayer) (64, 3, 608, 608) 0
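A mismatch like this can be spotted automatically by comparing the two files. The following is a rough sketch, assuming the spec contains the `output_image_width` / `output_image_height` lines quoted above and that the model summary prints the input shape as (batch, channels, height, width). The function names are hypothetical.

```python
import re

def spec_size(spec_text):
    """Read output_image_width/height from the spec text as (width, height)."""
    w = re.search(r"output_image_width\s*:\s*(\d+)", spec_text)
    h = re.search(r"output_image_height\s*:\s*(\d+)", spec_text)
    return (int(w.group(1)), int(h.group(1))) if w and h else None

def log_input_size(log_text):
    """Read the InputLayer shape from the log as (width, height).

    Assumption: the shape is printed as (batch, channels, height, width).
    """
    m = re.search(r"\(InputLayer\)\s*\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)", log_text)
    return (int(m.group(4)), int(m.group(3))) if m else None

spec = "output_image_width: 1280\noutput_image_height: 960\n"
log = "Input (InputLayer) (64, 3, 608, 608) 0\n"

if spec_size(spec) != log_input_size(log):
    print("Mismatch:", spec_size(spec), "in spec vs", log_input_size(log), "in log")
```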


  1. If you want to train a 608x608 model, please double-check that you have already resized your images/labels to 608x608 offline.
  2. Run kmeans.py to get proper anchor shapes from the label files.
  3. Fine-tune the hyper-parameters. For example, set a lower batch size (bs) and tune max_learning_rate.
  4. You can also run on a small part of the training dataset to tune the hyper-parameters.
  5. Try 1 GPU or 2 GPUs.
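For step 1, the labels have to be rescaled by the same factors as the images. Below is a minimal sketch, assuming standard KITTI detection labels where fields 5 to 8 of each line are the box corners x1, y1, x2, y2 in pixels. `scale_kitti_line` is a hypothetical helper, not a TLT tool, and the image itself must be resized separately with the same source and target sizes.

```python
def scale_kitti_line(line, src_size, dst_size):
    """Scale the bbox of one KITTI label line from src_size to dst_size.

    Sizes are (width, height). Only the four bbox fields are changed.
    """
    fields = line.split()
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    x1, y1, x2, y2 = (float(v) for v in fields[4:8])
    fields[4:8] = [f"{x1 * sx:.2f}", f"{y1 * sy:.2f}",
                   f"{x2 * sx:.2f}", f"{y2 * sy:.2f}"]
    return " ".join(fields)

label = ("person 0.00 0 0.00 100.00 200.00 300.00 400.00 "
         "0.00 0.00 0.00 0.00 0.00 0.00 0.00")
print(scale_kitti_line(label, (1216, 1216), (608, 608)))
```

Note that resizing a non-square image to 608x608 uses different factors for width and height, so the boxes are stretched along with the image.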

Thanks for the reply.

Where can I get kmeans.py?

By the way, why should I use only 1 or 2 GPUs?

kmeans.py is inside the TLT docker.
As for 1 GPU or 2 GPUs, that is just for comparative experiments.
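To illustrate what anchor clustering does in general, here is a small standalone sketch of k-means over box (width, height) pairs using IoU as the similarity measure. It is not the kmeans.py shipped in the TLT docker, whose exact algorithm and options may differ; the function names and the deterministic initialization are assumptions made for this example.

```python
def iou_wh(box, anchor):
    """IoU of two (w, h) boxes aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (w, h) pairs into k anchor shapes, sorted by area."""
    # Deterministic start: k boxes evenly spaced through the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assign every box to the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        # Move each anchor to the mean shape of its cluster.
        updated = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else anchors[i]
            for i, c in enumerate(clusters)
        ]
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])

boxes = [(10, 20), (12, 22), (50, 60), (52, 58), (200, 150), (210, 160)]
print(kmeans_anchors(boxes, k=3))
# [(11.0, 21.0), (51.0, 59.0), (205.0, 155.0)]
```

In practice the (w, h) pairs would come from the label files after they have been scaled to the training input size, and YOLO v3 typically uses nine anchors split across three scales.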