I trained DetectNet_v2 on a custom dataset with the Transfer Learning Toolkit (TLT). The dataset consists of car license plate images, and I resized all images to the same size, (h, w) = (288, 672).
The accuracy of some classes is 0. I think some classes are hard to classify, so I tried a deeper backbone.
I have tried detectnet_v2 + resnet18 and detectnet_v2 + resnet50, but neither result is good.
Can you give me some training suggestions?
You have trained a 672x288 model.
I have one question about your evaluation log.
Why does it say 224x96? Can you double-check that you ran tlt-evaluate with the 672x288 model?
Layer (type) Output Shape Param
input_1 (InputLayer) (None, 3, 96, 224) 0
Sorry, I uploaded files from mismatched experiments; that log came from a different experiment. The training and evaluation files below match each other.
detectnet_v2_train_resnet18_kitti.txt (35.6 KB) evaluation_288_672.txt (47.7 KB)
Could you please refer to the similar topic "DetectNet v2 18 Layers for Character Recognition (35 Classes)"? That user trained 35 classes with detectnet_v2.
I computed class_weight values and retrained, but the result is still not good. I also tried increasing the L1 weight_decay, but then the mAP drops to 0.
Is detectnet_v2 not well suited to unbalanced data or to closely spaced bboxes in an image?
for i in range(len(class_count_array)):
    class_weight_array[i] = 1.0 / (class_count_array[i] / max_class_count)
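The loop above can be written as a small self-contained function. This is only a sketch of the same inverse-frequency idea (weight = max_count / count); the function name `inverse_frequency_weights` and the example counts are made up for illustration and do not come from the attached spec files.

```python
def inverse_frequency_weights(class_counts):
    """Map each class name to max_count / count.

    Equivalent to 1.0 / (count / max_count) from the loop above:
    the most frequent class gets weight 1.0, rarer classes get more.
    """
    if not class_counts:
        raise ValueError("class_counts is empty")
    max_count = max(class_counts.values())
    weights = {}
    for name, count in class_counts.items():
        if count <= 0:
            # A class with no labeled objects would cause a division by zero.
            raise ValueError(f"class {name!r} has no labeled objects")
        weights[name] = max_count / count
    return weights


# Hypothetical object counts per character class.
example_counts = {"0": 400, "B": 100, "Z": 50}
example_weights = inverse_frequency_weights(example_counts)
for name, weight in example_weights.items():
    print(name, weight)
```

With these example counts, the class seen 400 times gets weight 1.0 and the class seen 50 times gets weight 8.0.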
With yolov3 in TLT I get a better result, an mAP of 92.0.
I use the yolov3 detector (a word detector) as the secondary inference engine in DeepStream. When a particular plate is close to the image boundary, the word detector crashes.
0:06:05.745221257 5732 0x17aa8280 WARN nvinfer gstnvinfer.cpp:1240:convert_batch_and_push_to_input_thread:<secondary_gie_0> error: NvBufSurfTransform failed with error -2 while converting buffer
ERROR from secondary_gie_0: NvBufSurfTransform failed with error -2 while converting buffer
Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1240): convert_batch_and_push_to_input_thread (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0
What are the average width and height of the bboxes in your training labels?
You can compare the details with the topic I linked earlier. That user achieved a high mAP on 35 classes.
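One way to get the average bbox width and height per class is sketched below. It assumes the labels are KITTI-format text files (class name first, then xmin, ymin, xmax, ymax in fields 5 to 8), as the attached training spec name suggests, and that the labels already match the resized 672x288 images. The function names and the sample lines are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def iter_label_lines(label_dir):
    """Yield every non-empty line from the *.txt files in label_dir."""
    for path in sorted(Path(label_dir).glob("*.txt")):
        for line in path.read_text().splitlines():
            if line.strip():
                yield line


def bbox_size_stats(label_lines):
    """Return {class_name: (mean_width, mean_height, count)}."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for line in label_lines:
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed lines
        xmin, ymin, xmax, ymax = map(float, fields[4:8])
        entry = sums[fields[0]]
        entry[0] += xmax - xmin
        entry[1] += ymax - ymin
        entry[2] += 1
    return {name: (w / n, h / n, n) for name, (w, h, n) in sums.items()}


# Made-up label lines, not taken from the real dataset.
sample = [
    "A 0.00 0 0.00 10.00 20.00 30.00 60.00 0 0 0 0 0 0 0",
    "A 0.00 0 0.00 100.00 20.00 130.00 50.00 0 0 0 0 0 0 0",
    "7 0.00 0 0.00 40.00 20.00 56.00 60.00 0 0 0 0 0 0 0",
]
for name, (mean_w, mean_h, count) in bbox_size_stats(sample).items():
    print(f"{name}: mean width {mean_w:.1f}, mean height {mean_h:.1f}, n={count}")
```

For a real dataset, `bbox_size_stats(iter_label_lines("path/to/labels"))` would cover the whole label directory; the path here is a placeholder.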
For yolo_v3, yes, it trains well for lots of classes.
For your new error, where the word detector crashes when a plate is close to the image boundary: is it an occasional corner case, or is it 100% reproducible?
I can reproduce it, but not every plate near the boundary causes a crash. After I increased input-object-min-width from 32 to 50, it works well. I don’t understand why the error occurs, because the RoI image should be resized to the word detector’s input shape, so there should not be any error.
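One possible reading, not confirmed in this thread, is that a plate box cut off by the frame edge becomes narrow, so a higher minimum width simply keeps such crops away from the secondary detector. The sketch below only illustrates that arithmetic. It is not DeepStream's implementation, and the frame size, box coordinates, and function names are all hypothetical.

```python
def clip_box(left, top, width, height, frame_w, frame_h):
    """Clip a (left, top, width, height) box to the frame; return the clipped box."""
    x0 = max(0.0, float(left))
    y0 = max(0.0, float(top))
    x1 = min(float(frame_w), float(left) + float(width))
    y1 = min(float(frame_h), float(top) + float(height))
    return x0, y0, max(0.0, x1 - x0), max(0.0, y1 - y0)


def passes_min_size(box, min_width, min_height=0):
    """True if the box is at least min_width x min_height pixels."""
    _, _, w, h = box
    return w >= min_width and h >= min_height


# Hypothetical 1920x1080 frame with a plate hanging over the right edge.
clipped = clip_box(left=1880, top=500, width=120, height=50,
                   frame_w=1920, frame_h=1080)
print(clipped)                       # (1880.0, 500.0, 40.0, 50.0)
print(passes_min_size(clipped, 32))  # True: would still reach the word detector
print(passes_min_size(clipped, 50))  # False: filtered out with the higher threshold
```

In this made-up case only 40 px of the 120 px wide plate is visible, so it passes a threshold of 32 but is skipped at 50.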
For the error you mentioned, please run a comparable experiment with tlt-infer: run inference on one such plate that is close to the image boundary and check whether tlt-infer handles it correctly. If it does, the error occurs in DeepStream inference. In that case, please check your DeepStream config file or ask for help on the DeepStream forum.