I have fine-tuned the classification_tf2 network from the TAO toolkit. I can generate the TensorRT engine normally, and when I evaluate it the results are satisfactory. However, when I use the model in deepstream the results differ from the ones I got running inference in tao-deploy. I am aware the engine must be generated for the specific GPU, which is why I generate a new engine in deepstream.
These are my config files in deepstream 6.3:
The deepstream app:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
[tiled-display]
enable=1
rows=1
columns=1
width=640
height=360
gpu-id=0
nvbuf-memory-type=0
[source0]
enable=1
type=3
uri=file:///root/top/Downloads/vid10_36.mkv
num-sources=1
gpu-id=0
cudadec-memtype=0
[sink0]
enable=1
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0
[streammux]
gpu-id=0
live-source=0
batch-size=1
batched-push-timeout=40000
width=2688
height=1520
enable-padding=1
nvbuf-memory-type=0
[primary-gie]
enable=1
gpu-id=0
batch-size=1
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config.txt
#The config file:
[property]
gpu-id=0
net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0
batch-size=1
onnx-file=/root/top/experiments_final3/efficientnet-b0_013.onnx
labelfile-path=/root/top/experiments_final3/labels.txt
model-engine-file=/root/top/experiments_final3/efficientnet-b0_013.onnx_b1_gpu0_int8.engine
int8-calib-file=/root/top/experiments_final3/cal.bin
infer-dims=3;256;256
uff-input-blob-name=input_1
output-blob-names=Identity:0
process-mode=1
network-mode=1
network-type=1
num-detected-classes=3
interval=0
gie-unique-id=1
classifier-async-mode=1
classifier-threshold=0.2
##The training spec file:
dataset:
  train_dataset_path: "/home/data/train"
  val_dataset_path: "/home/data/val"
  preprocess_mode: 'torch'
  num_classes: 3
  augmentation:
    enable_center_crop: False
    enable_random_crop: False
    disable_horizontal_flip: True
    enable_color_augmentation: False
    mixup_alpha: 0
train:
  qat: False
  checkpoint: ''
  batch_size_per_gpu: 32
  num_epochs: 200
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.0005
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
  results_dir: '/home/experiments_final_3/train'
model:
  backbone: 'efficientnet-b0'
  input_width: 256
  input_height: 256
  input_channels: 3
evaluate:
  dataset_path: "/home/data/test"
  checkpoint: "/home/experiments_final_3/train/efficientnet-b0_010.tlt"
  top_k: 1
  batch_size: 16
  n_workers: 8
  results_dir: '/home/machukamendosk/experiments_final_3/evaluation'
export:
  checkpoint: "/home/experiments_final_3/train/efficientnet-b0_013.tlt"
  onnx_file: '/home/experiments_final_3/export/efficientnet-b0_013.onnx'
  results_dir: '/home/experiments_final_3/export'
inference:
  checkpoint: ''
  trt_engine: '/home/experiments_final_3/export/efficientnet-b0_013.int8.engine'
  image_dir: '/home/data/inference1'
  classmap: '/home/experiments_final_3/train/classmap.json'
  results_dir: '/home/experiments_final_3/inference1'
gen_trt_engine:
  onnx_file: '/home/experiments_final_3/export/efficientnet-b0_013.onnx'
  trt_engine: '/home/experiments_final_3/export/efficientnet-b0_013.int8.engine'
  results_dir: '/home/experiments_final_3/export'
  tensorrt:
    data_type: "int8"
    max_workspace_size: 4
    max_batch_size: 16
    calibration:
      cal_image_dir: '/home/data/val'
      cal_data_file: '/home/experiments_final_3/export/calib.tensorfile'
      cal_cache_file: '/home/experiments_final_3/export/cal.bin'
      cal_batches: 20
I see the preprocessing is pretty important, but I cannot figure out what is wrong.
I would like to add that I am using tao-toolkit:5.3.0-deploy for engine generation and inference, and tao-toolkit:5.0.0-tf2.11.0 for training.
Morganh (September 4, 2024, 8:00am):
Please set to below and retry.
net-scale-factor=0.0175070028011204
offsets=123.675;116.28;103.53
Because in deepstream, y = net-scale-factor * (x - mean).
In tao-tf2 torch mode, y = (x/255 - torch_mean)/std = (x - 255*torch_mean) * (1/(255*std)).
So, net-scale-factor = 1/(255*std), and mean (the offsets) = 255 * torch_mean.
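As a quick sanity check, the values above can be derived with a few lines of Python (a sketch, assuming the standard ImageNet mean/std that tao-tf2 torch mode uses):
# nvinfer computes y = net-scale-factor * (x - offset) per channel, so:
#   offsets = 255 * torch_mean
#   net-scale-factor = 1 / (255 * std)  (nvinfer takes a single scalar, so the
#   per-channel stds 0.229/0.224/0.225 are approximated by one value)
torch_mean = [0.485, 0.456, 0.406]
torch_std = [0.229, 0.224, 0.225]
print([255 * m for m in torch_mean])     # [123.675, 116.28, 103.53] -> offsets
print(1.0 / (255 * torch_std[1]))        # ~0.0175070028011204 -> net-scale-factor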
I retried but the results haven't changed. In tao-deploy inference and in deepstream, the same images are assigned to different classes. It is quite strange.
Morganh (September 4, 2024, 8:26am):
For the same frame, if tao-deploy infers class A while deepstream infers class B, you can check the labels.txt file to verify the class order.
The labels.txt file is correct. I mean the frames do not always get different classes: some frames are classified properly, but a significant number of them are not classified as they should be. (Classification in tao-deploy inference is better than in deepstream.)
I have ensured that the preprocessing is the same, the labels are in the correct order, and the sizes are correct.
Morganh (September 4, 2024, 8:34am):
Please run fp32 mode to check if it is the same as tao-deploy.
More info can be found in Fine-tuned TAO ClassificationTF2 Accuracy Drop after Compiling to TensorRT - #31 by Morganh as well.
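For reference, switching nvinfer to FP32 only requires changing network-mode in the [property] group shown earlier (0 = FP32, 1 = INT8, 2 = FP16); a minimal sketch, with the engine file name assumed from nvinfer's usual auto-generated naming:
network-mode=0
model-engine-file=/root/top/experiments_final3/efficientnet-b0_013.onnx_b1_gpu0_fp32.engine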
I have read that topic during the last days, and I see that preprocessing plays a big role. I ran fp32 mode as well, and the results are still not satisfactory. The classification performance in deepstream is still poor compared to the classification performance in tao-deploy (inference). I have tried different preprocessing parameters, but I cannot figure out why the accuracy drops.
Morganh (September 4, 2024, 9:17am):
To narrow down, please use below trtexec command to generate fp32 engine inside tao-deploy docker and deepstream docker.
trtexec --onnx=/path/to/model.onnx \
  --maxShapes=input_1:1x3x256x256 \
  --minShapes=input_1:1x3x256x256 \
  --optShapes=input_1:1x3x256x256 \
  --saveEngine=fp32.engine
In tao-deploy docker, run evaluation against it.
In the deepstream docker, do not let deepstream generate an engine; make sure deepstream uses the engine you generated via trtexec.
And compare again.
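To make deepstream pick up a pre-built engine, the [property] group can point model-engine-file directly at it, for example (path assumed; if the engine was built with a different TensorRT version than the one in the deepstream container, deserialization fails and nvinfer falls back to rebuilding from onnx-file):
model-engine-file=/root/top/experiments_final3/fp32.engine
network-mode=0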
As a further experiment, please generate an mp4 file instead.
I have followed your instructions. First, when I use trtexec in the tao-deploy docker and run inference, the results are quite good. After that I use trtexec in the deepstream docker, but during inference in deepstream the results are not satisfactory. I tried to use the engine generated in tao-deploy, but deepstream shows "deserialize backend context from engine from file :/root/top/Tao_deploy_here/efficientnet-b0_013_1.fp32.engine failed, try rebuild" and rebuilds a new engine. I tried to run inference in tao-deploy using the engine generated in deepstream, but I get this error: AttributeError: 'NoneType' object has no attribute 'create_execution_context'
Morganh (September 4, 2024, 10:20am):
cristian.machuca.mendoza:
First when I use trtexec in tao-deploy docker, when i run inference the results are quite good. After that i use trtexec in deepstream, during inference in deepstream the results are not satisfactory.
More experiments here.
exp1:
Please generate an .avi file and an .mp4 file and retry.
$ gst-launch-1.0 multifilesrc location="/tmp/%d.jpg" caps="image/jpeg,framerate=30/1" ! jpegdec ! x264enc ! avimux ! filesink location="out.avi"
$ apt-get install ffmpeg
$ ffmpeg -framerate 2 -pattern_type glob -i '*.jpg' -c:v libx264 -pix_fmt yuv420p -vf "crop=trunc(iw/2)*2:trunc(ih/2)*2" out.mp4
exp2:
Please refer to the config_as_primary_gie.txt in Fine-tuned TAO ClassificationTF2 Accuracy Drop after Compiling to TensorRT - #31 by Morganh .
config.txt (940 Bytes)
deepstream_app.txt (2.4 KB)
These are my config files in deepstream. I have converted the video as well. The results continue to be unsatisfactory. I have tried different parameters, for example maintain-aspect-ratio=0 and 1, but it does not work out. I would like to add that I have followed all the recommendations given here: Fine-tuned TAO ClassificationTF2 Accuracy Drop after Compiling to TensorRT - TAO Toolkit - NVIDIA Developer Forums. In the spec file I have not used augmentation, as recommended.
Morganh (September 5, 2024, 5:06am):
How did you train the model? Did the training use center_crop?
Here is the training spec file. I did not use center crop:
dataset:
  train_dataset_path: "/home/data/train"
  val_dataset_path: "/home/data/val"
  preprocess_mode: 'torch'
  num_classes: 3
  augmentation:
    enable_center_crop: False
    enable_random_crop: False
    disable_horizontal_flip: True
    enable_color_augmentation: False
    mixup_alpha: 0
train:
  qat: False
  checkpoint: ''
  batch_size_per_gpu: 32
  num_epochs: 200
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.0005
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
  results_dir: '/home/experiments_final_3/train'
model:
  backbone: 'efficientnet-b0'
  input_width: 256
  input_height: 256
  input_channels: 3
evaluate:
  dataset_path: "/home/data/test"
  checkpoint: "/home/train/efficientnet-b0_010.tlt"
  top_k: 1
  batch_size: 16
  n_workers: 8
  results_dir: '/home/experiments_final_3/evaluation'
export:
  checkpoint: "/home/train/efficientnet-b0_013.tlt"
  onnx_file: '/home/export/efficientnet-b0_013.onnx'
  results_dir: '/home/experiments_final_3/export'
inference:
  checkpoint: ''
  trt_engine: '/home/export/efficientnet-b0_013.int8.engine'
  image_dir: '/home/data/inference1'
  classmap: '/home/experiments_final_3/train/classmap.json'
  results_dir: '/home/experiments_final_3/inference1'
gen_trt_engine:
  onnx_file: '/home/experiments_final_3/export/efficientnet-b0_013.onnx'
  trt_engine: '/home/export/efficientnet-b0_013.int8.engine'
  results_dir: '/home/experiments_final_3/export'
  tensorrt:
    data_type: "int8"
    max_workspace_size: 4
    max_batch_size: 16
    calibration:
      cal_image_dir: '/home/data/val'
      cal_data_file: '/home/experiments_final_3/export/calib.tensorfile'
      cal_cache_file: '/home/experiments_final_3/export/cal.bin'
      cal_batches: 20
@Morganh I have the same issue even when using classification_tf1. In deepstream the performance is reduced.
Morganh (September 5, 2024, 2:17pm):
Classification_tf2 should have no issue. May I know which deepstream docker you are running?
I tried 7.0-samples-multiarch, 7.0-triton-multiarch, and 6.3-samples. In all three I have the same issue. I also tried with other data in case my dataset was problematic (I trained using the cats-and-dogs dataset) and also got errors. Inference in tao-deploy 5.5.0 is more accurate than in deepstream. These are all my files:
config_file_deepstream.txt (1.1 KB)
deepstream_app.txt (2.5 KB)
engine.txt (279 Bytes)
evaluate_engine.txt (274 Bytes)
export.txt (273 Bytes)
inference.txt (275 Bytes)
labels.txt (7 Bytes)
train.txt (272 Bytes)
train_spec.txt (2.2 KB)
Morganh (September 5, 2024, 2:39pm):
I suggest you check whether the preprocessing is the same between tao-deploy and deepstream.
In tao-deploy, you can add debug code in tao_deploy/nvidia_tao_deploy/cv/classification_tf2/scripts/inference.py at 31c7e0ed3fe48942c254b3b85517e7418eea17b3 · NVIDIA/tao_deploy · GitHub and tao_deploy/nvidia_tao_deploy/cv/classification_tf1/dataloader.py at 31c7e0ed3fe48942c254b3b85517e7418eea17b3 · NVIDIA/tao_deploy · GitHub to save images after preprocessing. Similar to Fine-tuned TAO ClassificationTF2 Accuracy Drop after Compiling to TensorRT - #35 by Morganh . This will help you understand the preprocessing based on your training spec file.
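For example, a small sketch of such debug code (the helper function and its insertion point are assumptions, not part of the tao-deploy sources; it assumes a CHW float32 array that was normalized with the torch-mode mean/std):
import numpy as np
from PIL import Image

def save_preprocessed(chw, path, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    # Undo torch-mode normalization and save the image that actually goes into the engine.
    hwc = np.asarray(chw, dtype=np.float32).transpose(1, 2, 0)
    hwc = (hwc * np.array(std) + np.array(mean)) * 255.0
    Image.fromarray(np.clip(hwc, 0, 255).astype(np.uint8)).save(path)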
Also, you can leverage the tao-deploy code to write a standalone inference script that runs inference against the TensorRT engine.
Check if your standalone inference can get the expected result.
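A minimal standalone sketch of that idea follows. It assumes TensorRT 8.x with pycuda, a single input binding named input_1 (1x3x256x256, RGB, NCHW), a single 3-class output, and torch-mode preprocessing; the paths are placeholders, and the code is not taken from the tao-deploy sources:
import numpy as np
import pycuda.autoinit  # noqa: F401  creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt
from PIL import Image

ENGINE_PATH = "/path/to/fp32.engine"   # placeholder
IMAGE_PATH = "/path/to/frame.jpg"      # placeholder
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(path):
    # Resize to the network input size, apply torch-mode normalization, HWC -> NCHW.
    img = Image.open(path).convert("RGB").resize((256, 256))
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - MEAN) / STD
    return np.ascontiguousarray(x.transpose(2, 0, 1)[None])

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
context.set_binding_shape(0, (1, 3, 256, 256))  # needed if the batch dim is dynamic

inp = preprocess(IMAGE_PATH)
out = np.empty((1, 3), dtype=np.float32)        # 3 classes per the spec file
d_in, d_out = cuda.mem_alloc(inp.nbytes), cuda.mem_alloc(out.nbytes)
cuda.memcpy_htod(d_in, inp)
context.execute_v2([int(d_in), int(d_out)])     # binding order assumed: input, output
cuda.memcpy_dtoh(out, d_out)
print("predicted class:", int(out.argmax()), out)

You can then compare the predicted class per frame with what deepstream reports for the same frame.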
Thanks @Morganh for the answer. I am looking into it. I have a couple of questions. First, is it possible to save the preprocessed images from deepstream? Second, how can I disable the preprocessing in deepstream so I can feed the preprocessed images saved from tao-deploy to the TRT engine?