TAO deploy docker eval accuracy lower vs TAO tf1 docker

I used PeopleNet and retrained it using the sample notebook and the reworked detectnet sample spec file. I achieved around 82% mAP, but get only around 60% when using the pre/post-processing from the tao deploy backend. I modified those scripts to run inference on the exported .onnx model and got these numbers with the .onnx model. Let me know what additional info I should provide, thank you.

May I know if you are running the command `tao deploy detectnet_v2 gen_trt_engine xxx` to generate the tensorrt engine and then running `tao deploy detectnet_v2 evaluate xxx`?
Refer to tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2/detectnet_v2.ipynb at main · NVIDIA/tao_tutorials · GitHub.

Did you set --onnx_route tf2onnx when running tao model detectnet_v2 export?

I am not running gen_trt_engine; I just modified the tao deploy evaluate script to run inference with the onnx model, ran it, and got the poor accuracy. I did set --onnx_route tf2onnx, and after modifying the tao tf1 backend eval script, I got a similar accuracy with the .onnx model as with the .tlt model, so I know the .onnx model exported correctly. So with the same onnx model, I am getting different precision numbers between the scripts, with tao deploy being much worse than the tao tf1 eval script's numbers for both the .tlt and .onnx models.

OK, so the .onnx file evaluation in the tao-tf1 backend is correct.

So the .onnx file evaluation in the tao-deploy backend is not good.

To narrow down, can you compare these two cases in terms of preprocessing and postprocessing? For example, for preprocessing, please check if the input array is the same before feeding to the onnx file.
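One way to run that comparison, sketched here under my own assumptions (neither backend dumps its preprocessed batches out of the box, so you would add code to save them to .npy yourself; the helper name and dummy shapes below are mine), is to diff the two arrays right before they are fed to the model:

```python
import numpy as np

def inputs_match(a: np.ndarray, b: np.ndarray, atol: float = 1e-5) -> bool:
    """Return True when two preprocessed input tensors agree elementwise."""
    if a.shape != b.shape:
        print(f"shape mismatch: {a.shape} vs {b.shape}")
        return False
    diff = np.abs(a.astype(np.float64) - b.astype(np.float64))
    print(f"max abs diff: {diff.max():.6g}")
    return bool(np.allclose(a, b, atol=atol))

# Dummy tensors shaped like a 1x3x544x960 DetectNet_v2 input batch
tf1_input = np.zeros((1, 3, 544, 960), dtype=np.float32)
deploy_input = tf1_input.copy()
deploy_input[..., 375:, :] = 0.5  # simulate a difference in the bottom rows
print(inputs_match(tf1_input, deploy_input))  # differs -> False
```

If the arrays already disagree here, the gap is in preprocessing; if they match, the gap is in postprocessing or clustering.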

Sounds good, I will check. Also, I just ran tao deploy detectnet_v2 gen_trt_engine with my onnx file, and ran both evaluate scripts on the tensorrt engine, unmodified, to compare. With the tao tf1 backend script, in the 5.0.0-tf1.15.5 container, I got these stats:

Mean average_precision (in %): 81.9608

+------------+--------------------------+
| class name | average precision (in %) |
+------------+--------------------------+
| car        | 79.68903131394937        |
| cyclist    | 88.703266082089          |
| pedestrian | 77.49001246669008        |
+------------+--------------------------+

With the tao deploy one, in the 5.0.0-deploy container, I got :
{"AP_pedestrian": 0.3743627745760669, "AP_car": 0.7023298344921652, "AP_cyclist": 0.48978648688513293}

So I can reproduce the precision gap with tensorrt as well.

Looks like there is a difference in preprocessing. Here is an example of the inputs into the model for tao deploy:

Here is an example for tao tf1:

After changing the postprocessing to match, my precision improved to slightly higher than the tao tf1 backend's, which is interesting, but the issue is resolved overall. Ideally I would still like to get the same, or nearly the same, precision from both.

Could you share your findings?

May I know what was changed?

It looks like the preprocessing in the tf1 backend resizes the image to 960 x 375 and then pads the bottom to make it 960 x 544. The deploy one resizes the image directly to 960 x 544, with no padding. In the postprocessing, all detected bounding boxes from model inference were scaled up to the original image size, but since the image height didn't actually change (padding was just added), I removed the scaling on the image height.
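The coordinate math behind that fix can be sketched as follows (function names and the 1242 x 375 KITTI frame size are my assumptions, not code from either backend): with resize-and-pad, only the width is actually rescaled, so only x coordinates should be scaled back; scaling y by original_height / 544, as the direct-resize path does, shrinks every box vertically.

```python
def scale_boxes_direct(boxes, orig_w, orig_h, net_w=960, net_h=544):
    """Map boxes from 960x544 network space back to image space, assuming
    the image was resized directly to net_w x net_h (tao-deploy style)."""
    sx, sy = orig_w / net_w, orig_h / net_h
    return [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]

def scale_boxes_padded(boxes, orig_w, net_w=960):
    """Map boxes back assuming the image was resized to net_w wide with the
    height unchanged, then bottom-padded to net height (tao-tf1 style):
    the height was never rescaled, so y coordinates pass through untouched."""
    sx = orig_w / net_w
    return [(x1 * sx, y1, x2 * sx, y2) for x1, y1, x2, y2 in boxes]

# A detection in network space on a hypothetical 1242x375 KITTI frame:
box = [(100.0, 50.0, 300.0, 200.0)]
print(scale_boxes_direct(box, 1242, 375))  # y wrongly shrunk by 375/544
print(scale_boxes_padded(box, 1242))       # y preserved
```

The x scaling is identical in both paths; only the y treatment differs, which is consistent with dropping the height scale fixing the gap.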

Hi @anthonyp0329,
Could you share the spec file you used when running evaluation with tf1 and tao-deploy?

random_seed: 42
dataset_config {
data_sources {
tfrecords_path:
image_directory_path:
}
image_extension: "png"
target_class_mapping {
key: "pedestrian"
value: "pedestrian"
}
target_class_mapping {
key: "person_sitting"
value: "pedestrian"
}
target_class_mapping {
key: "car"
value: "car"
}
target_class_mapping {
key: "cyclist"
value: "cyclist"
}

target_class_mapping {
key: "van"
value: "car"
}
validation_fold: 0
}
augmentation_config {
preprocessing {
output_image_width: 960
output_image_height: 544
crop_right: 960
crop_bottom: 544
min_bbox_width: 1.0
min_bbox_height: 1.0
output_image_channel: 3
}
spatial_augmentation {
hflip_probability: 0.5
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 8.0
translate_max_y: 8.0
}
color_augmentation {
hue_rotation_max: 25.0
saturation_shift_max: 0.20000000298023224
contrast_scale_max: 0.10000000149011612
contrast_center: 0.5
}
}
postprocessing_config {
target_class_config {
key: "pedestrian"
value {
clustering_config {
coverage_threshold: 0.007499999832361937
minimum_bounding_box_height: 20
dbscan_eps: 0.23000000417232513
dbscan_min_samples: 1
dbscan_confidence_threshold: 0.8999999761581421
}
}
}
target_class_config {
key: "car"
value {
clustering_config {
coverage_threshold: 0.004999999888241291
minimum_bounding_box_height: 20
dbscan_eps: 0.20000000298023224
dbscan_min_samples: 1
dbscan_confidence_threshold: 0.8999999761581421
}
}
}
target_class_config {
key: "cyclist"
value {
clustering_config {
coverage_threshold: 0.004999999888241291
minimum_bounding_box_height: 20
dbscan_eps: 0.15000000596046448
dbscan_min_samples: 1
dbscan_confidence_threshold: 0.8999999761581421
}
}
}

}
model_config {
pretrained_model_file:
num_layers: 34
use_batch_norm: true
objective_set {
bbox {
scale: 35.0
offset: 0.5
}
cov {
}
}
training_precision {
}
arch: "resnet"
all_projections: true
load_graph: true
}
evaluation_config {
validation_period_during_training: 10
first_validation_epoch: 30
minimum_detection_ground_truth_overlap {
key: "pedestrian"
value: 0.5
}
minimum_detection_ground_truth_overlap {
key: "car"
value: 0.699999988079071
}
minimum_detection_ground_truth_overlap {
key: "cyclist"
value: 0.5
}

evaluation_box_config {
key: "pedestrian"
value {
minimum_height: 20
maximum_height: 9999
minimum_width: 10
maximum_width: 9999
}
}
evaluation_box_config {
key: "car"
value {
minimum_height: 20
maximum_height: 9999
minimum_width: 10
maximum_width: 9999
}
}
evaluation_box_config {
key: "cyclist"
value {
minimum_height: 20
maximum_height: 9999
minimum_width: 10
maximum_width: 9999
}
}

average_precision_mode: INTEGRATE
}
cost_function_config {
target_classes {
name: "pedestrian"
class_weight: 4.0
coverage_foreground_weight: 0.05000000074505806
objectives {
name: "cov"
initial_weight: 1.0
weight_target: 1.0
}
objectives {
name: "bbox"
initial_weight: 10.0
weight_target: 10.0
}
}
target_classes {
name: "car"
class_weight: 1.0
coverage_foreground_weight: 0.05000000074505806
objectives {
name: “cov”
initial_weight: 1.0
weight_target: 1.0
}
objectives {
name: "bbox"
initial_weight: 10.0
weight_target: 10.0
}
}
target_classes {
name: "cyclist"
class_weight: 8.0
coverage_foreground_weight: 0.05000000074505806
objectives {
name: "cov"
initial_weight: 1.0
weight_target: 1.0
}
objectives {
name: "bbox"
initial_weight: 10.0
weight_target: 1.0
}
}
max_objective_weight: 0.9998999834060669
min_objective_weight: 9.999999747378752e-05
}
training_config {
batch_size_per_gpu: 1
num_epochs: 120
learning_rate {
soft_start_annealing_schedule {
min_learning_rate: 1e-06
max_learning_rate: 0.0001
soft_start: 0.05
annealing: 0.95
}
}
regularizer {
type: L1
weight: 3.000000026176508e-09
}
optimizer {
adam {
epsilon: 9.899999930951253e-09
beta1: 0.8999999761581421
beta2: 0.9990000128746033
}
}
cost_scaling {
initial_exponent: 20.0
increment: 0.005
decrement: 1.0
}
checkpoint_interval: 10
}
bbox_rasterizer_config {
target_class_config {
key: "pedestrian"
value {
cov_center_x: 0.5
cov_center_y: 0.5
cov_radius_x: 1.0
cov_radius_y: 1.0
bbox_min_radius: 1.0
}
}
target_class_config {
key: "car"
value {
cov_center_x: 0.5
cov_center_y: 0.5
cov_radius_x: 0.4000000059604645
cov_radius_y: 0.4000000059604645
bbox_min_radius: 1.0
}
}
target_class_config {
key: "cyclist"
value {
cov_center_x: 0.5
cov_center_y: 0.5
cov_radius_x: 1.0
cov_radius_y: 1.0
bbox_min_radius: 1.0
}
}

deadzone_radius: 0.4000001549720764
}

You can set enable_auto_resize: true in the training spec file. More info can be found in the user guide (DetectNet_v2 - NVIDIA Docs). Then the preprocessing should be aligned.
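Per the DetectNet_v2 page of the user guide, the flag belongs in the preprocessing block of augmentation_config. A sketch of the relevant fragment, reusing the values from the spec above (placement is my reading of the docs, so please verify against the user guide):

```
augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    enable_auto_resize: true
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
}
```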

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks
