TLT ResNet18 performance drop between .tlt inference and .engine

• Jetson NX
• Classification ResNet18
• TLT Version 3.0
• Training spec file:
model_config {
  arch: "resnet"
  n_layers: 18
  use_batch_norm: true
  all_projections: true
  freeze_blocks: 0
  freeze_blocks: 1
  input_image_size: "3,144,256"
}
train_config {
  train_dataset_path: "/workspace/tlt-experiments/data/train"
  val_dataset_path: "/workspace/tlt-experiments/data/val"
  pretrained_model_path: "/workspace/tlt-experiments/results/resnet_022_PRUNED.tlt"
  optimizer {
    sgd {
      lr: 0.01
      decay: 0.0
      momentum: 0.9
      nesterov: False
    }
  }
  batch_size_per_gpu: 64
  n_epochs: 200
  n_workers: 16
  preprocess_mode: "caffe"
  enable_random_crop: True
  enable_center_crop: True
  label_smoothing: 0.0
  mixup_alpha: 0.1
  reg_config {
    type: "L2"
    scope: "Conv2D,Dense"
    weight_decay: 0.00005
  }
  lr_config {
    step {
      learning_rate: 0.006
      step_size: 10
      gamma: 0.1
    }
  }
}
eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/test"
  model_path: "/workspace/tlt-experiments/results/weights/resnet_074.tlt"
  top_k: 3
  batch_size: 256
  n_workers: 8
  enable_center_crop: True
}

Hello,

I trained a classification model with TLT to be used as a secondary model in DeepStream. I followed the documentation and got 97% accuracy on my test set. I then used tlt-converter to make an engine file (for the NX), and when I run this model I only get about 20% accuracy (basically I think the model is just guessing). Also, interestingly, my confidences are 0.8 and above with the .tlt, but they drop to around 0.4 with the engine.

For the training data I cropped the objects of interest out of the images and trained the model on the cropouts. I did not resize or pad the images, since the dataloader should take care of this, correct?

I think the problem is that, when the engine is used as a secondary classifier, the images it sees are very different from the ones the model was trained on. To debug this, can you give me more details on the following:

  • I am training with preprocess mode "caffe". What does this do? Do I need to set anything special in DeepStream to emulate it? Does it work in RGB or BGR, and would this make a difference?
  • There is a DeepStream parameter called "offsets" (array of mean values of colour components to be subtracted from each pixel). How do I find out what the mean subtraction values should be? Does this preprocessing of the cropouts affect what the images at training time should look like? What are the default mean subtraction values at training time?
  • In what order are the PGIE cropouts preprocessed by DeepStream before the secondary classifier: crop → mean subtraction → normalization → bilinear resize → padding (bottom right only)? What colour is the padding, and is it consistent between the TLT training dataloader and DeepStream?

For TLT classification model inference, there are 3 methods.

1st is: tlt classification inference. You already mentioned that it is running well.

2nd is: standalone Python inference. I made some modifications based on one customer's code; see Inferring resnet18 classification etlt model with python. That end user can get the same result as tlt inference.

3rd is: run inference with DeepStream. Please see the solution (comments 21, 24, 32) in Issue with image classification tutorial and testing with deepstream-app.

  Main changes:
  • Set the offsets to 103.939;116.779;123.68 (see the sketch after this list)
  • Generate the avi file with GStreamer
  • Set scaling-filter=5
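For reference, preprocess_mode: "caffe" in the classification spec follows the Keras "caffe" convention: the image is handled in BGR channel order and the ImageNet channel means are subtracted, with no further scaling. A minimal Python sketch of that preprocessing (the direct bilinear resize to the 256x144 network input is an assumption for illustration; the training dataloader may also crop):

import cv2
import numpy as np

# ImageNet channel means in B, G, R order; these are the values behind
# the recommended offsets=103.939;116.779;123.68.
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_preprocess(image_path, width=256, height=144):
    """Approximation of "caffe" preprocess mode: BGR channel order,
    ImageNet mean subtraction, no extra scaling."""
    bgr = cv2.imread(image_path)                              # OpenCV loads images as BGR
    bgr = cv2.resize(bgr, (width, height), interpolation=cv2.INTER_LINEAR)
    x = bgr.astype(np.float32) - BGR_MEANS                    # per-channel mean subtraction
    return x.transpose(2, 0, 1)                               # HWC -> CHW for the network

The engine will only see equivalent inputs if DeepStream is configured to feed BGR and subtract the same means, which is what the settings above achieve.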

OK, I will try the suggestions above, thank you. I trained the TLT model on RGB images, so I am using model-color-format=0. Is this correct? And should I still set offsets=103.939;116.779;123.68 with RGB?

Thank you

Please set the BGR configuration:
model-color-format=1

Also please set the offsets as below:
offsets=103.939;116.779;123.68

Reference: comment 21 of Issue with image classification tutorial and testing with deepstream-app - #21 by Morganh
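For background, nvinfer's pre-processing is documented as y = net-scale-factor * (x - mean), where the mean values come from offsets (here the ImageNet means in B, G, R order). With net-scale-factor=1.0, model-color-format=1 (BGR) and offsets=103.939;116.779;123.68, this reproduces the training-time "caffe" preprocessing. A minimal NumPy sketch of the formula (for checking values offline, not actual DeepStream code):

import numpy as np

def nvinfer_preprocess(bgr_pixels, net_scale_factor=1.0,
                       offsets=(103.939, 116.779, 123.68)):
    """y = net-scale-factor * (x - mean), applied per channel.
    bgr_pixels: H x W x 3 array already in BGR order (model-color-format=1)
    and already resized to the 256x144 network input."""
    x = bgr_pixels.astype(np.float32) - np.asarray(offsets, dtype=np.float32)
    return net_scale_factor * x

With these values the result matches the sketch above, which is why net-scale-factor must stay at 1.0 for a "caffe"-trained model.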

Those parameters helped a bit, but model performance is still not quite what it should be. The ResNet I trained has input dimensions 3,144,256 (c,h,w). Do I need to set maintain-aspect-ratio=1?

Not needed.

Did you ever set

  • scaling-filter=5

and, if possible, please

  • generate the avi file with GStreamer

Please share your latest config file.

This is my latest config file for the secondary model:
config.txt (793 Bytes)

Sorry, what do you mean by avi file? Should I convert the mp4 inference video (with tracker bboxes) to avi, or did you mean something else?

See Issue with image classification tutorial and testing with deepstream-app - #24 by Morganh

gst-launch-1.0 multifilesrc location="/tmp/%d.jpg" caps="image/jpeg,framerate=30/1" ! jpegdec ! x264enc ! avimux ! filesink location="out.avi"

The avi file is better than the mp4 file for inference.

Also, in the other topic mentioned above, the end user can run inference well with a TLT classification model. So please refer to the config file https://forums.developer.nvidia.com/uploads/short-url/rk4x7xqir6N1nl3QpfxBcTTE6FA.txt in Issue with image classification tutorial and testing with deepstream-app - #21 by Morganh to narrow down, for example process-mode=1, etc.

Sorry, can you describe the process for making the avi file? I can run the command you sent, but do I need a folder of images (cropouts)?

Yes, the command below will generate an avi file from jpg files.
gst-launch-1.0 multifilesrc location="/tmp/%d.jpg" caps="image/jpeg,framerate=30/1" ! jpegdec ! x264enc ! avimux ! filesink location="out.avi"

OK, I will try this. My cropouts are all different sizes, though. Am I supposed to resize and pad them to the same size to make the .avi? If so, how? Bilinear interpolation and then pad bottom right?

You can generate the jpg files via ffmpeg:
$ ffmpeg -i xxx.mp4 folder/%d.jpg

There is no need to resize or pad.
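If ffmpeg is not handy, the same numbered jpg sequence can also be produced with OpenCV; a minimal sketch (assuming cv2 is installed, and matching the %d.jpg naming used by the ffmpeg and multifilesrc commands above):

import cv2

def mp4_to_jpgs(video_path, out_dir):
    """Dump every frame of the video as out_dir/1.jpg, 2.jpg, ...
    so the gst-launch multifilesrc pipeline above can pick them up."""
    cap = cv2.VideoCapture(video_path)
    index = 1
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(f"{out_dir}/{index}.jpg", frame)
        index += 1
    cap.release()

# Example: mp4_to_jpgs("xxx.mp4", "folder")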

What is this .avi file for? I already have a video to run inference on. I thought the idea was to make an .avi video consisting of cropouts so that I can run classification as a primary model?

Also, after primary inference, what are the cropouts that are sent to the secondary model supposed to look like? Resized and padded bottom right?

If you already have an avi file, please use it directly. I thought you only had an mp4 file.
Can you follow Issue with image classification tutorial and testing with deepstream-app - #21 by Morganh to run inference with only one GIE? In that case, there is no 2nd GIE.

Hello, I ran the classification ResNet18 TLT model on the avi file (as the only GIE) and I get bad performance. What do you think could be the issue?

Also, what are the processing steps between the primary and secondary GIE normally? Can you describe what happens to the bbox cropouts, please?

Thanks

Firstly, please make sure tlt classification inference runs well. Please double check, and try running more test images. If that is good, it means your tlt model can run inference well against the test images.

Then you can export the tlt model to an etlt model and run inference with the etlt model in DeepStream. As we synced above, pay attention to the config file. You can use the primary GIE only; it will work on the whole test image (process-mode=1), so in this case no bboxes are cropped.

When I try to run the .etlt model I get this error:
Linking elements in the Pipeline

linking recording pipeline
Opening in BLOCKING MODE
Opening in BLOCKING MODE
Opening in BLOCKING MODE
Opening in BLOCKING MODE
Starting pipeline

Opening in BLOCKING MODE
Opening in BLOCKING MODE
Opening in BLOCKING MODE
Opening in BLOCKING MODE
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvDCF][Warning] minTrackingConfidenceDuringInactive is deprecated
[NvDCF] Initialized
0:00:01.951649466 21660 0x558f83a950 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
ERROR: Uff input blob name is empty
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:03.594655486 21660 0x558f83a950 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1736> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

Can you share the full command and config files?

I can confirm that tlt classification inference runs well.
I managed to run DeepStream with the .etlt model, but the performance is not good on my test video. What could be the reason for this? My config is as follows:
[property]
gpu-id=0
net-scale-factor=1.0
model-color-format=1
offsets=103.939;116.779;123.68
num-detected-classes=13
output-blob-names=predictions/Softmax
#model-engine-file=path/to/engine
tlt-encoded-model=path/to/etlt
tlt-model-key=mykey
labelfile-path=path/to/labels
network-mode=2
process-mode=1
gie-unique-id=1
operate-on-gie-id=1
classifier-async-mode=0
classifier-threshold=0.1
interval=0
batch-size=16
scaling-filter=5
network-type=1
workspace-size=4096
infer-dims=3;144;256
maintain-aspect-ratio=0
enable-dla=1
use-dla-core=0
uff-input-blob-name=input_1

[class-attrs-all]