Hi all,
I have been training Yolo V4 with the CSPDarknet53 backbone based on the sample Jupyter Notebook provided with TLT 3.0. I could get decent accuracy and I’m quite happy with the trained model in TLT.
In order to set the input dimensions of the network to 1024x768, I used the following augmentation_config section in the TLT specification file:
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1024
  output_height: 768
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
After training, and after visualising inference with TLT on some test images to assess the model, I exported an .etlt file using tlt yolo_v4 export to use it with DeepStream 5.1.
Here is the export command:
!tlt yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned_darknet53_nofreeze_norelu_dataok/weights/yolov4_cspdarknet53_epoch_$EPOCH.tlt \
-k $KEY \
-o $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet53_epoch_${EPOCH}_b1.etlt \
-e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
--batch_size 1 \
--data_type fp32
So in the DeepStream PGIE configuration file, in the [property] section, I must use inference dims of 3x384x1248, otherwise the application crashes with a dimension mismatch:
[property]
gpu-id=0
offsets=103.939;116.779;123.68
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=1
tlt-encoded-model=../models/model.etlt
tlt-model-key=nvidia_tlt
labelfile-path=../models/labels.txt
infer-dims=3;384;1248
tlt-encoded-model=../models/model.etlt
tlt-model-key=<some_encoding>
labelfile-path=../models/labels.txt
infer-dims=3;384;1248
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
network-type=0
num-detected-classes=15
is-classifier=0
maintain-aspect-ratio=0
output-blob-names=BatchedNMS
cluster-mode=3
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=../lib/post_processor/libnvds_infercustomparser_tlt.so
3x384x1248 seems to be the default input size in TLT for Yolo V4, but I thought I had changed that by updating the augmentation_config section.
So how can I force DeepStream to use the 3x1024x768 inference dimensions that were used during training?
As a side question, how can we choose the values for offsets in the [property] section?
Thanks,
Johan
It is not expected. For your case, infer-dims needs to be 3;768;1024 (channels;height;width).
Regarding the offsets: they are preprocessing values. Please do not change them.
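For background, a sketch of what those values mean, based on nvinfer's standard preprocessing: each pixel is transformed as y = net-scale-factor * (x - offsets[c]), so the offsets are the per-channel means subtracted before inference. The values in your config are the well-known ImageNet channel means in BGR order, which matches model-color-format=1 and is presumably what TLT used during training:

# nvinfer preprocessing: y = net-scale-factor * (x - offsets[c])
# 103.939;116.779;123.68 are the ImageNet channel means in B;G;R order,
# consistent with model-color-format=1 (BGR)
offsets=103.939;116.779;123.68
net-scale-factor=1
model-color-format=1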
That’s my problem: if I set it to 3;768;1024, it crashes.
It seems that the network is stuck at an input size of 3x384x1248.
Here is the message I get when I set infer-dims=3;768;1024:
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT Input 3x384x1248
1 OUTPUT kINT32 BatchedNMS 0
2 OUTPUT kFLOAT BatchedNMS_1 200x4
3 OUTPUT kFLOAT BatchedNMS_2 200
4 OUTPUT kFLOAT BatchedNMS_3 200
....
ERROR: tlt/tlt_decode.cpp:274 failed to build network since parsing model errors.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:797 Failed to create network using custom network creation function
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:862 Failed to get cuda engine from custom library API
0:00:04.171234278 19485 0x564d60618a10 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Have you set tlt-encoded-model to your trained model?
DeepStream will not stick to a fixed input size.
I suggest you try running GitHub - NVIDIA-AI-IOT/deepstream_tao_apps (sample apps demonstrating how to deploy models trained with TAO on DeepStream); it includes sample models with a 960x544 input size.
If that runs successfully, then replace the sample model with your trained model.
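If it helps, the basic flow there is roughly as follows. This is a sketch only: the repo layout, the CUDA_VER value, and the build target can vary between repo versions, so follow the repo's README; the paths below are assumptions:

# clone the sample apps and build the custom bbox parser
# (produces libnvds_infercustomparser_tlt.so, as referenced in the config above)
git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps
export CUDA_VER=11.1   # assumption: set to your installed CUDA version
make -C post_processor

Then run the sample detection app with the provided 960x544 models as described in the README.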
That’s what I understood as well.
I’ll give the sample app a try.
Thanks,
Johan
Hi again,
I could run the sample app successfully with the 960x544 input size.
I plugged in my model and still had the same issue: DeepStream only runs if I use an input size of 3x384x1248. It crashes with 3x768x1024. I also tried 3x1024x768, just to be sure I had not swapped the dimensions.
Here is some additional info: the backbone is CSPDarknet53, all images were resized to 1024x768 before training, and the augmentation_config section was set as shown above. The anchor shapes were also generated using the 1024x768 size.
I just checked the training output, and the input layer has the correct size:
Layer (type) Output Shape Param # Connected to
==================================================================================================
Input (InputLayer) (None, 3, 768, 1024) 0
__________________________________________________________________________________________________
Thanks,
Johan
Can you double-check the config file in DeepStream? I can see some extra lines; for example, there are two tlt-encoded-model entries.
The config file is correct (I made a mistake when copying and pasting into the original post, hence the extra lines).
Here is the config file:
[property]
gpu-id=0
offsets=103.939;116.779;123.68
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=1
tlt-encoded-model=../models/model.etlt
tlt-model-key=nvidia_tlt
labelfile-path=../models/labels.txt
infer-dims=3;384;1248
#infer-dims=3;768;1024
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
network-type=0
num-detected-classes=15
is-classifier=0
maintain-aspect-ratio=0
output-blob-names=BatchedNMS
cluster-mode=3
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=../lib/post_processor/libnvds_infercustomparser_tlt.so
There is no other .etlt file in the app directory.
I know it is very unlikely, but could the issue come from the tlt yolo_v4 export step?
No, it will not.
Actually, this is the first time I have seen this kind of issue from a user.
Some tips:
- Run tlt yolo_v4 evaluate and check the log to confirm that you really trained a .tlt model with a 1024x768 input.
- Run tlt-converter to generate a TensorRT engine from your .etlt model, then set that engine in the DeepStream config file with model-engine-file=your-trt-engine. In that case, comment out tlt-encoded-model and tlt-model-key. A sketch of the converter call is below.
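For the tlt-converter step, a minimal invocation might look like the following. This is a sketch only: YOLO v4 models exported from TLT 3.0 use a dynamic batch dimension, so the converter takes -p with min/opt/max shapes; the shapes below assume your 3x768x1024 input and a maximum batch size of 16, and <your_model>.etlt is a placeholder for your exported file. Check the YOLO v4 section of the TLT docs for the exact flags of your version:

# generate a TensorRT engine from the exported .etlt model
tlt-converter -k $KEY \
              -p Input,1x3x768x1024,8x3x768x1024,16x3x768x1024 \
              -t fp16 \
              -e trt.fp16.engine \
              <your_model>.etlt

Then set model-engine-file=trt.fp16.engine in the [property] section and comment out tlt-encoded-model and tlt-model-key.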
I solved my issue: it was actually in the export step.
It was my mistake: the path to the spec file (the -e argument) was not correct.
So it is resolved :).
Thanks for the help and sorry for the inconvenience.
Cheers!
Hi there!
Great work! Would you be able to help me?
I have trained yolo_v4 using TLT 3.0 with the sample notebook and the provided data. Now I am trying to run inference with the exported model in DeepStream 5.1. I am having a hard time understanding the config file, and because of that I am unable to perform inference and keep running into errors. Would you be kind enough to provide the simplest possible config file for a yolo_v4 model trained on the provided data?
Thank you