TLT detectnet_v2 set training width and height

neuroSparK · December 9, 2020, 5:04am

I have trained a detectnet_v2 model on a KITTI-formatted dataset; it shows an average precision of 58% during evaluation in TLT. But when running on video, it can’t detect a single object properly. Here is the training config file: resnet_train.txt (3.0 KB)
My input images are 1280x720, so where should I set the width and height parameters in the training config file?

Morganh · December 9, 2020, 5:58am

When you said "running in video" above, did you mean you are using DeepStream to run inference? If yes, please share the DeepStream config file.

neuroSparK · December 9, 2020, 3:55pm

Yes, deepstream-app was used for inference. Here is the pgie config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
labelfile-path=detectnet_v2_labels.txt
tlt-encoded-model=resnet18_detector.etlt
tlt-model-key=ZHBmNTA4cHRkNDZwM2****************S00NTZjLTlhOWYtMzI3N2U0ODBiMWU1
#infer-dims=3;544;960
infer-dims=3;720;1280
uff-input-order=0
uff-input-blob-name=input_1
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
interval=0
gie-unique-id=1
is-classifier=0

[class-attrs-all]
pre-cluster-threshold=0.05
group-threshold=1
eps=0.2
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
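
As a quick sanity check on the config above, the following minimal Python sketch parses `infer-dims` and compares it with the training resolution. It assumes `infer-dims` is ordered channels;height;width. The helper name `check_infer_dims` and the shortened inline config string are illustrative only and are not part of DeepStream.

```python
import configparser

# Shortened copy of the [property] section above, for illustration only.
PGIE_SNIPPET = """
[property]
gpu-id=0
#infer-dims=3;544;960
infer-dims=3;720;1280
batch-size=1
"""


def check_infer_dims(config_text, train_width, train_height):
    """Return True if infer-dims (assumed C;H;W) matches the training size."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw = parser["property"]["infer-dims"]
    channels, height, width = (int(v) for v in raw.split(";"))
    return (width, height) == (train_width, train_height)


print(check_infer_dims(PGIE_SNIPPET, 1280, 720))
```

A mismatch here would point at the deployment config rather than at training, so it is worth ruling out before retraining.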

Morganh · December 9, 2020, 3:58pm

How about the tlt-infer result? Can it detect well?

Morganh · December 9, 2020, 4:00pm

Also, please share the screenshot when you run inference with deepstream.

neuroSparK · December 9, 2020, 4:31pm

Here is the result of running the tlt-infer command, although the evaluation shows 58% average precision. The model is trained to detect vehicle license plates. Note that the green bboxes in the training images were dumped by running fd_lpd.caffemodel with deepstream-app, and the red bboxes are the inference results.

neuroSparK · December 9, 2020, 4:42pm

Here’s a sample training image and corresponding label

Labelfile.txt

license_plate 0 0 0 0 508.133331 58.074078 531.066650 71.555557 0 0 0 0 0 0

Morganh · December 9, 2020, 5:23pm

According to your figure 1, the tlt-infer result is also not good.
So, could you please run tlt-infer against more images and then calculate the average precision?
If it is similar to 58%, I am afraid you will need to run more experiments to improve the training mAP.

neuroSparK · December 9, 2020, 5:33pm

In all of the tlt-infer results, the left coordinate of the bbox is at 0 in the frame. Here are some inference images in the zip file:
tlt-infer.zip (2.0 MB)

So my question is: is there any need to define the image width and height in the training spec file if the training image size differs from the standard KITTI size (i.e. 1248x384)?

Morganh · December 9, 2020, 5:38pm

Hey, please note that if you want to train a 1248x384 detectnet_v2 model, you need to resize all the images and labels to 1248x384 offline. If you want to train a 1280x720 detectnet_v2 model, you need to resize all the images and labels to 1280x720 offline.

See Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation
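
The label half of the offline resize described above can be sketched as follows. This is a minimal illustration, assuming the standard 15-field KITTI layout with xmin, ymin, xmax, ymax at zero-based fields 4 to 7. The function name `scale_kitti_line`, the sample line, and the 640x360 source size are hypothetical; the images themselves would need the same resize with an image library, which is not shown.

```python
def scale_kitti_line(line, src_size, dst_size):
    """Scale the bbox of one KITTI label line from src_size to dst_size.

    Sizes are (width, height) tuples. Only the four bbox fields change.
    """
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    scale_x = dst_size[0] / src_size[0]
    scale_y = dst_size[1] / src_size[1]
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    scaled = [xmin * scale_x, ymin * scale_y, xmax * scale_x, ymax * scale_y]
    fields[4:8] = [f"{v:.2f}" for v in scaled]
    return " ".join(fields)


# Hypothetical label for a 640x360 image, rescaled to 1280x720.
sample = "car 0.00 0 0.00 100.00 50.00 200.00 150.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
print(scale_kitti_line(sample, (640, 360), (1280, 720)))
```

If the training images are already at the target resolution, as with the 1280x720 images in this thread, no rescaling is needed and the labels can be used as they are.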

neuroSparK · December 10, 2020, 6:45am

All of my training images are 1280x720 in size. In the label file, all the 0 fields are int but as per documentation, I see some should be float. Could that be a problem?

Morganh · December 10, 2020, 6:47am

Suggest to follow the format mentioned in Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation
The bbox coordinate values are also floats.
For example,
cyclist 0.00 0 0.00 665.45 160.00 717.93 217.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00
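
To compare a label line against this layout, a small check like the one below may help. It is a sketch that assumes the 15-field KITTI layout (class, truncated, occluded, alpha, four bbox values, then seven 3D fields) and that the 3D fields are all zero for 2D-only labels. The helper names are made up for illustration. Note that the sample label posted earlier in the thread has four zeros before the coordinates instead of three, so a reader using this layout would take xmin as 0; the thread does not confirm whether that is the cause of the problem.

```python
KITTI_FIELD_COUNT = 15  # class, truncated, occluded, alpha, 4 bbox, 7 3D fields


def read_bbox(line):
    """Return (xmin, ymin, xmax, ymax) as read from zero-based fields 4 to 7."""
    fields = line.split()
    if len(fields) != KITTI_FIELD_COUNT:
        raise ValueError(f"expected {KITTI_FIELD_COUNT} fields, got {len(fields)}")
    return tuple(float(v) for v in fields[4:8])


def nonzero_3d_fields(line):
    """Return indices of 3D fields (8 and up) that are not zero.

    For 2D-only labels a non-zero value here suggests shifted columns.
    """
    fields = line.split()
    return [i for i in range(8, len(fields)) if float(fields[i]) != 0.0]


posted = "license_plate 0 0 0 0 508.133331 58.074078 531.066650 71.555557 0 0 0 0 0 0"
reference = "cyclist 0.00 0 0.00 665.45 160.00 717.93 217.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00"

for name, line in (("posted", posted), ("reference", reference)):
    print(name, read_bbox(line), nonzero_3d_fields(line))
```

Both lines have 15 fields, so a field count alone would not reveal the difference; the position of the coordinates is what differs.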

Morganh · December 14, 2020, 5:02am

@neuroSparK
Any update? Is the issue fixed on your side?
BTW, what is the average size of the bboxes for the LPs? Are they too small?

neuroSparK · December 14, 2020, 5:49am

The issue is not fixed yet. I couldn’t figure out what’s going wrong. Anyway, is there any way to get the pretrained model for the fd_lpd.caffemodel found in the redaction example?

Morganh · December 14, 2020, 6:29am

Hi @neuroSparK,
Could we focus on your incorrect tlt-infer result first?
I want to figure out why you get wrong results with tlt-infer.
If possible, could you share the full training log? The latest training spec is also appreciated.

neuroSparK · December 22, 2020, 5:41pm

I have found a small formatting error in the label fields. I will retrain with the correction and then check again.
