YOLOv4 config file

TensorRT Version: 7.2.1.6
Quadro RTX 5000 dual GPU
Driver Version: 455.23.05
CUDA Version: 11.1
Ubuntu 18.04
Python 3.6

docker_registry: nvcr.io
docker_tag: v3.21.08-py3

I am training a custom YOLOv4 model using the Transfer Learning Toolkit, and I am facing a few problems while building the model.

1. If you use your own dataset, you will need to run the code below to generate the best anchor shape:

!tlt yolo_v4 kmeans -l $DATA_DOWNLOAD_DIR/training/label_2 \
                    -i $DATA_DOWNLOAD_DIR/training/image_2 \
                    -n 9 \
                    -x 1248 \
                    -y 384

The anchor shapes generated by this script are sorted. Write the first 3 into small_anchor_shape in the config file. Write the middle 3 into mid_anchor_shape. Write the last 3 into big_anchor_shape.
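The split described in the notebook text can be sketched as follows. This is only an illustration: the helper name `split_anchors` and the anchor values are hypothetical, and it assumes the nine (width, height) pairs are already in the sorted order the kmeans step prints.

```python
from typing import Dict, List, Tuple

Anchor = Tuple[float, float]  # (width, height) in pixels


def split_anchors(sorted_anchors: List[Anchor]) -> Dict[str, List[Anchor]]:
    """Split nine already-sorted anchors into the three config groups."""
    if len(sorted_anchors) != 9:
        raise ValueError("expected exactly 9 anchors (kmeans was run with -n 9)")
    return {
        "small_anchor_shape": sorted_anchors[0:3],  # first 3
        "mid_anchor_shape": sorted_anchors[3:6],    # middle 3
        "big_anchor_shape": sorted_anchors[6:9],    # last 3
    }


# Hypothetical anchor values, for illustration only.
example_anchors = [
    (10.0, 12.0), (18.0, 20.0), (25.0, 40.0),
    (40.0, 30.0), (60.0, 55.0), (80.0, 90.0),
    (120.0, 100.0), (200.0, 150.0), (300.0, 250.0),
]
groups = split_anchors(example_anchors)
for name, shapes in groups.items():
    print(name, shapes)
```

How the three lists are written into the spec file should be checked against the spec shipped with the notebook.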

-x and -y give the shape of the image. My data actually has two different image shapes, so how should I get the anchors?

2. VehicleTypeNet | NVIDIA NGC

The documentation for this model says:

RGB Images of dimensions: 224 X 224 X 3 (W x H x C)
Channel Ordering of the Input: NCHW, where N = Batch Size, C = number of channels (3), H = Height of images (224), W = Width of the images (224)
Input scale: None
Mean subtraction: [103.939, 116.779, 123.68]

So should I resize my data to this shape, 224 X 224 X 3?

3. YOLOv4 gives many options for the pretrained model. Is resnet18 the vehicle net?

For your case, first decide which model input size you want to train at: check the output_width and output_height you set in the training spec. Then rescale the bboxes in all the label files to that size, and run kmeans to get the anchors.
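A minimal sketch of the bbox rescaling step, assuming the labels are in KITTI format (suggested by the label_2 directory), where fields 5 to 8 of each line are xmin, ymin, xmax, ymax in pixels. The function name and the example sizes are hypothetical. With two different source image shapes, each label file would be scaled using its own image's width and height.

```python
def scale_kitti_label_line(line: str, src_w: int, src_h: int,
                           dst_w: int, dst_h: int) -> str:
    """Rescale the bbox of one KITTI label line from the source image size
    to the training size (output_width x output_height in the spec)."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("not a KITTI label line: " + line)
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    # Assumption: indices 4..7 hold xmin, ymin, xmax, ymax (KITTI layout).
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    scaled = [xmin * sx, ymin * sy, xmax * sx, ymax * sy]
    fields[4:8] = ["{:.2f}".format(v) for v in scaled]
    return " ".join(fields)


# Hypothetical example: a 1920x1080 image rescaled to 960x540.
original = "car 0.00 0 0.00 100.00 50.00 300.00 150.00 0 0 0 0 0 0 0"
rescaled = scale_kitti_label_line(original, 1920, 1080, 960, 540)
print(rescaled)
```

Only the four bbox fields are changed; all other fields are passed through as they are.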

VehicleTypeNet is a classification model. It is not related to the YOLOv4 model. May I know why you are asking about VehicleTypeNet?

I do not understand. May I know what the “options” are? The pretrained models on NGC are not related to vehicles. They are trained on the OpenImage dataset.

So I am going to build a YOLOv4 detection model for vehicles with 12 classes, using the Transfer Learning Toolkit. I have around 11k data samples.

The YOLOv4 Jupyter notebook has the following step for choosing a pretrained model to download:

!ngc registry model list nvidia/tlt_pretrained_object_detection:*

  1. vgg19
  2. vgg16
  3. squeezenet
  4. resnet50
  5. resnet34
  6. resnet18
  7. resnet101
  8. resnet10
  9. mobilenetv2
  10. mobilenetv1
  11. googlenet
  12. efficientnet_b1_swish
  13. efficientnet_b1_relu
  14. efficientnet_b0_swish
  15. efficientnet_b0_relu
  16. Darknet53
  17. Darknet19
  18. cspdarknet53
  19. cspdarknet19

Which pretrained model is best for my use case?

These are pretrained models for different backbones. Which one is best depends on your requirements: consider the trade-off between mAP and FPS. You can start by trying resnet18.

And what is the preferred amount of data?

There is no rule for the amount of data. A larger dataset will result in a longer training time. I think you can start with your 11k data.
