Deepstream with tlt resnet50 model giving unknown warning

muhammad.rana · November 20, 2019, 5:53am

I have retrained a TLT ResNet50 model following https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/ but when I run it in DeepStream it gives this warning:

WARNING: Num classes mismatch. Configured:5, detected by network: 300 4 1

The throughput is also very low, around 10 FPS. Is that because of the warning, or should I ignore it? Also, how can I improve the throughput?

–>deepstream config file

# Copyright (c) 2018 NVIDIA Corporation. All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
kitti-track-output-dir=/dfs/AutomationWorkspace/metadata/20191016-170001/camera16/cam16Concat_28fps

#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=1
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
#(5): nvbuf-mem-handle - Allocate Surface Handle memory, applicable for Jetson
#(6): nvbuf-mem-system - Allocate Surface System memory, allocated using calloc
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:/dfs/AutomationWorkspace/EncodedVideos/20191016-170001/camera16/cam16Concat_28fps.mp4
#uri=file:/software/Videos_Concatenated/28fpsvideo_Encoded.mp4
num-sources=1
gpu-id=1

# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0

[sink1]
enable=1
type=1
output-file=/dfs/AutomationWorkspace/2019-09-17-01200-01500_objdt.mp4
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
codec=3
sync=0
gpu-id=1
#iframeinterval=10
bitrate=2000000
#output-file=/software/td_cafe/take11/camera16/2019-09-17-01200-01500_objdt.mp4
source-id=0

[sink0]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=1
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=1
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=4
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1280
height=720
#num-surfaces-per-frame=31
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=1
#model-engine-file=model_b4_int8.engine
labelfile-path=frcnn_labels.txt
batch-size=4
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_resnet50.txt

[tracker]
enable=1
tracker-width=320
tracker-height=180
#ll-lib-file=/usr/local/deepstream/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_tracker.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so
#ll-config-file required for IOU only
ll-config-file=/root/deepstream_sdk_v4.0_x86_64/samples/configs/deepstream-app/tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=1
enable-batch-process=1

[tests]
file-loop=0

–>resnet50 config file
[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=frcnn_labels.txt

# Provide the .etlt model exported by TLT or a TensorRT engine created by tlt-converter
# If use .etlt model, please also specify the key('nvidia_tlt')
#model-engine-file=./rcnn.engine
tlt-encoded-model=frcnn_kitti_1.etlt
tlt-model-key=cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk
uff-input-dims=3;384;1280;0
uff-input-blob-name=input_1
batch-size=1
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=5
interval=1
gie-unique-id=1
is-classifier=0
#network-type=0
output-blob-names=dense_regress/BiasAdd;dense_class/Softmax;proposal
parse-bbox-func-name=NvDsInferParseCustomFrcnnUff
custom-lib-path=nvdsinfer_customparser_frcnn_uff/libnvds_infercustomparser_frcnn_uff.so

[class-attrs-all]
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
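
The comment above [primary-gie] says that properties set there override the ones in the infer config file, and the two files in this post disagree on several keys. A minimal Python sketch that lists those conflicts is shown here. It uses short excerpts of the two configs, assumes that keys with the same name in [primary-gie] and [property] refer to the same setting, and the helper names (read_section, overridden_keys) are hypothetical. The engine file name in the log further down in this thread (frcnn_kitti_1.etlt_b4_fp32.engine) is consistent with batch-size=4 from the app config taking effect.

```python
import configparser

# Excerpt of [primary-gie] from the deepstream-app config in this post.
APP_CONFIG = """
[primary-gie]
enable=1
gpu-id=1
batch-size=4
interval=0
gie-unique-id=1
config-file=config_infer_primary_resnet50.txt
"""

# Excerpt of [property] from the nvinfer config in this post.
INFER_CONFIG = """
[property]
gpu-id=0
batch-size=1
interval=1
gie-unique-id=1
network-mode=0
num-detected-classes=5
"""


def read_section(text, section):
    """Parse INI-style text and return one section as a plain dict."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(text)
    return dict(parser[section])


def overridden_keys(app_gie, infer_props):
    """Return {key: (infer_value, app_value)} for keys set differently in both."""
    return {
        key: (infer_props[key], app_gie[key])
        for key in infer_props
        if key in app_gie and app_gie[key] != infer_props[key]
    }


conflicts = overridden_keys(
    read_section(APP_CONFIG, "primary-gie"),
    read_section(INFER_CONFIG, "property"),
)
for key, (infer_value, app_value) in sorted(conflicts.items()):
    print(f"{key}: infer config says {infer_value}, app config says {app_value}")
```

Keeping the two files in agreement (same gpu-id, batch-size, and interval) removes any doubt about which value is in effect.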

Morganh · November 20, 2019, 6:22am

Hi muhammad,
Which Jetson platform are you running, Nano?
Could you please paste the full log along with the running command here? Thanks.

muhammad.rana · November 20, 2019, 7:01am

Hi Morganh
Thanks for the reply

We’re running on Tesla GPUs.

→ Logs
root@glistergpu1:~/deepstream_sdk_v4.0_x86_64/sources/apps/sample_apps/deepstream-app# deepstream-app -c /root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/deepstream_app_config_resnet50.txt
Creating LL OSD context new
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is ON
[NvDCF] Initialized
0:00:01.461487162 9915 0x55cbfc273f80 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:27.078436544 9915 0x55cbfc273f80 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/frcnn_kitti_1.etlt_b4_fp32.engine

Runtime commands:
h: Print this help
q: Quit

    p: Pause
    r: Resume

**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:163>: Pipeline ready

** INFO: <bus_callback:149>: Pipeline running

Creating LL OSD context new
WARNING: Num classes mismatch. Configured:5, detected by network: 300 4 1
**PERF: 11.91 (11.91)
**PERF: 11.16 (11.47)
**PERF: 11.16 (11.36)
**PERF: 11.34 (11.35)

Morganh · November 20, 2019, 7:28am

Hi muhammad,
How many classes did you train for the frcnn network?
You can check your training log. Also, please double-check that your frcnn_labels.txt matches.
Then please change "num-detected-classes=5" to "num-detected-classes=4".

muhammad.rana · November 20, 2019, 7:51am

I have changed it to 4, but I still get the same warning:
WARNING: Num classes mismatch. Configured:4, detected by network: 300 4 1

→ Config file for training
random_seed: 42
enc_key: ‘cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk’
verbose: True
network_config {
input_image_config {
image_type: RGB
image_channel_order: ‘bgr’
size_height_width {
height: 384
width: 1280
}
image_channel_mean {
key: ‘b’
value: 103.939
}
image_channel_mean {
key: ‘g’
value: 116.779
}
image_channel_mean {
key: ‘r’
value: 123.68
}
image_scaling_factor: 1.0
}
feature_extractor: “resnet:50”
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
roi_mini_batch: 256
rpn_stride: 16
conv_bn_share_bias: True
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
use_pooling:False
}
training_config {
kitti_data_config {
images_dir : ‘/workspace/tlt-experiments/data/KITTI/training/image_2’
labels_dir: ‘/workspace/tlt-experiments/data/KITTI/training/label_2’
}
training_data_parser: ‘raw_kitti’
data_augmentation {
use_augmentation: True
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 0
translate_max_y: 0
}
color_augmentation {
color_shift_stddev: 0.0
hue_rotation_max: 0.0
saturation_shift_max: 0.0
contrast_scale_max: 0.0
contrast_center: 0.5
}
}
num_epochs: 12
class_mapping {
key: ‘Car’
value: 0
}
class_mapping {
key: ‘Van’
value: 0
}
class_mapping {
key: “Pedestrian”
value: 1
}
class_mapping {
key: “Person_sitting”
value: 1
}
class_mapping {
key: ‘Cyclist’
value: 2
}
class_mapping {
key: “background”
value: 3
}
class_mapping {
key: “DontCare”
value: -1
}
class_mapping {
key: “Truck”
value: -1
}
class_mapping {
key: “Misc”
value: -1
}
class_mapping {
key: “Tram”
value: -1
}
pretrained_weights: “/workspace/tlt-experiments/data/faster_rcnn/resnet50.h5”
pretrained_model: “”
output_weights: “/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.tltw”
output_model: “/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.tlt”
rpn_min_overlap: 0.3
rpn_max_overlap: 0.7
classifier_min_overlap: 0.0
classifier_max_overlap: 0.5
gt_as_roi: False
std_scaling: 1.0
classifier_regr_std {
key: ‘x’
value: 10.0
}
classifier_regr_std {
key: ‘y’
value: 10.0
}
classifier_regr_std {
key: ‘w’
value: 5.0
}
classifier_regr_std {
key: ‘h’
value: 5.0
}

rpn_mini_batch: 256
rpn_pre_nms_top_N: 12000
rpn_nms_max_boxes: 2000
rpn_nms_overlap_threshold: 0.7

reg_config {
reg_type: ‘L2’
weight_decay: 1e-4
}

optimizer {
adam {
lr: 0.00001
beta_1: 0.9
beta_2: 0.999
decay: 0.0
}
}

lr_scheduler {
step {
base_lr: 0.00001
gamma: 1.0
step_size: 30
}
}

lambda_rpn_regr: 1.0
lambda_rpn_class: 1.0
lambda_cls_regr: 1.0
lambda_cls_class: 1.0

inference_config {
images_dir: ‘/workspace/tlt-experiments/data/KITTI/val/image_2’
model: ‘/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.epoch1.tlt’
detection_image_output_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/inference_results_imgs’
labels_dump_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/inference_dump_labels’
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
bbox_visualize_threshold: 0.6
classifier_nms_max_boxes: 300
classifier_nms_overlap_threshold: 0.3
}

evaluation_config {
dataset {
images_dir : ‘/workspace/tlt-experiments/data/KITTI/val/image_2’
labels_dir: ‘/workspace/tlt-experiments/data/KITTI/val/label_2’
}
data_parser: ‘raw_kitti’
model: ‘/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.epoch1.tlt’
labels_dump_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/test_dump_labels’
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
classifier_nms_max_boxes: 300
classifier_nms_overlap_threshold: 0.3
object_confidence_thres: 0.0001
use_voc07_11point_metric:False
}

}

----> config file for retraining
random_seed: 42
enc_key: ‘cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk’
verbose: True
network_config {
input_image_config {
image_type: RGB
image_channel_order: ‘bgr’
size_height_width {
height: 384
width: 1280
}
image_channel_mean {
key: ‘b’
value: 103.939
}
image_channel_mean {
key: ‘g’
value: 116.779
}
image_channel_mean {
key: ‘r’
value: 123.68
}
image_scaling_factor: 1.0
}
feature_extractor: “resnet:50”
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
roi_mini_batch: 256
rpn_stride: 16
conv_bn_share_bias: True
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
use_pooling:False
}
training_config {
kitti_data_config {
images_dir : ‘/workspace/tlt-experiments/data/KITTI/training/image_2’
labels_dir: ‘/workspace/tlt-experiments/data/KITTI/training/label_2’
}
training_data_parser: ‘raw_kitti’
data_augmentation {
use_augmentation: True
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 0
translate_max_y: 0
}
color_augmentation {
color_shift_stddev: 0.0
hue_rotation_max: 0.0
saturation_shift_max: 0.0
contrast_scale_max: 0.0
contrast_center: 0.5
}
}
num_epochs: 12
class_mapping {
key: ‘Car’
value: 0
}
class_mapping {
key: ‘Van’
value: 0
}
class_mapping {
key: “Pedestrian”
value: 1
}
class_mapping {
key: “Person_sitting”
value: 1
}
class_mapping {
key: ‘Cyclist’
value: 2
}
class_mapping {
key: “background”
value: 3
}
class_mapping {
key: “DontCare”
value: -1
}
class_mapping {
key: “Truck”
value: -1
}
class_mapping {
key: “Misc”
value: -1
}
class_mapping {
key: “Tram”
value: -1
}
pretrained_weights: “”
pretrained_model: “/workspace/tlt-experiments/data/faster_rcnn/model_2_pruned.tlt”
output_weights: “/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.tltw”
output_model: “/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.tlt”
rpn_min_overlap: 0.3
rpn_max_overlap: 0.7
classifier_min_overlap: 0.0
classifier_max_overlap: 0.5
gt_as_roi: False
std_scaling: 1.0
classifier_regr_std {
key: ‘x’
value: 10.0
}
classifier_regr_std {
key: ‘y’
value: 10.0
}
classifier_regr_std {
key: ‘w’
value: 5.0
}
classifier_regr_std {
key: ‘h’
value: 5.0
}

rpn_mini_batch: 256
rpn_pre_nms_top_N: 12000
rpn_nms_max_boxes: 2000
rpn_nms_overlap_threshold: 0.7

reg_config {
reg_type: ‘L2’
weight_decay: 1e-4
}

optimizer {
adam {
lr: 0.00001
beta_1: 0.9
beta_2: 0.999
decay: 0.0
}
}

lr_scheduler {
step {
base_lr: 0.00001
gamma: 1.0
step_size: 30
}
}

lambda_rpn_regr: 1.0
lambda_rpn_class: 1.0
lambda_cls_regr: 1.0
lambda_cls_class: 1.0

inference_config {
images_dir: ‘/workspace/tlt-experiments/data/KITTI/val/image_2’
model: ‘/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.epoch1.tlt’
detection_image_output_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/inference_results_imgs_retrain’
labels_dump_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/inference_dump_labels_retrain’
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
bbox_visualize_threshold: 0.6
classifier_nms_max_boxes: 300
classifier_nms_overlap_threshold: 0.3
}

evaluation_config {
dataset {
images_dir : ‘/workspace/tlt-experiments/data/KITTI/val/image_2’
labels_dir: ‘/workspace/tlt-experiments/data/KITTI/val/label_2’
}
data_parser: ‘raw_kitti’
model: ‘/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.epoch1.tlt’
labels_dump_dir: ‘/workspace/tlt-experiments/data/faster_rcnn/test_dump_labels_retrain’
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
classifier_nms_max_boxes: 300
classifier_nms_overlap_threshold: 0.3
object_confidence_thres: 0.0001
use_voc07_11point_metric:False
}

}

Morganh · November 20, 2019, 8:02am

Could you please paste your frcnn_labels.txt? Thanks.

muhammad.rana · November 20, 2019, 8:06am

–>frcnn_label.txt

Automobile
Bicycle
Person
Roadsign
background

Morganh · November 20, 2019, 8:13am

According to the TLT doc:
"The label file is a text file, containing the names of the classes that the FasterRCNN model is trained to detect. The order in which the classes are listed here must match the order in which the model predicts the output. This order is derived from the order the objects are instantiated in the class_mapping field of the FasterRCNN experiment specification file."

So, according to your training spec, could you please modify your frcnn_label.txt as below and try again?

Car
Pedestrian
Cyclist
background
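
As a cross-check, the label list can be derived from the class_mapping entries with a short script. The following is a minimal Python sketch, not a TLT tool. It assumes that the label order follows the integer value of each class_mapping entry, that entries mapped to -1 are ignored, and that the first key seen for each value is used as the label name; those assumptions reproduce the list above. The function name labels_from_class_mapping is hypothetical.

```python
import re

# Excerpt of the class_mapping entries from the training spec in this thread
# (quotes normalized to plain ASCII quotes).
SPEC_EXCERPT = """
class_mapping {
key: 'Car'
value: 0
}
class_mapping {
key: 'Van'
value: 0
}
class_mapping {
key: "Pedestrian"
value: 1
}
class_mapping {
key: "Person_sitting"
value: 1
}
class_mapping {
key: 'Cyclist'
value: 2
}
class_mapping {
key: "background"
value: 3
}
class_mapping {
key: "DontCare"
value: -1
}
"""

# One class_mapping block: a quoted key followed by an integer value.
ENTRY = re.compile(
    r"""class_mapping\s*\{\s*key:\s*['"]([^'"]+)['"]\s*value:\s*(-?\d+)\s*\}"""
)


def labels_from_class_mapping(spec_text):
    """Return one label per non-negative class index, ordered by index."""
    first_name_by_index = {}
    for name, index_text in ENTRY.findall(spec_text):
        index = int(index_text)
        if index < 0:
            continue  # assumption: -1 marks classes dropped from training
        first_name_by_index.setdefault(index, name)
    return [first_name_by_index[i] for i in sorted(first_name_by_index)]


labels = labels_from_class_mapping(SPEC_EXCERPT)
print("\n".join(labels))
print("num-detected-classes =", len(labels))
```

The length of the resulting list is also the value to use for num-detected-classes in the nvinfer config.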

muhammad.rana · November 20, 2019, 9:04am

Tried it, still no change:

Creating LL OSD context new
WARNING: Num classes mismatch. Configured:4, detected by network: 300 4 1

Morganh · November 20, 2019, 10:07am

Hi muhammad,
There might be a false warning in https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps/blob/master/nvdsinfer_customparser_frcnn_uff/nvdsinfer_custombboxparser_frcnn_uff.cpp
We will look into it.

Morganh · November 21, 2019, 3:01am

Hi muhammad,
The fix is already available on GitHub.
Thanks very much for your finding and contribution.

@@ -309,7 +309,7 @@ bool NvDsInferParseCustomFrcnnUff (std::vector<NvDsInferLayerInfo> const &output

    /* Warn in case of mismatch in number of classes */
    if (!classMismatchWarn) {
-       if (covLayerDims.c != detectionParams.numClassesConfigured) {
+       if (covLayerDims.h != detectionParams.numClassesConfigured) {
            std::cerr << "WARNING: Num classes mismatch. Configured:" <<
                      detectionParams.numClassesConfigured << ", detected by network: " <<
                      covLayerDims.c << " " << covLayerDims.h << " " << covLayerDims.w << std::endl;
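
To see why the old check fired regardless of the configured value, here is a minimal Python sketch of the comparison before and after the fix (an illustration, not the parser's code). It assumes the three numbers in the warning are the class-score layer's c, h, w dimensions, since the std::cerr line prints them in that order. Reading c = 300 as a per-image ROI count is only a guess based on rpn_nms_max_boxes: 300 in the spec above. The Dims type and the function names are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for the layer dimensions printed by the warning.
Dims = namedtuple("Dims", ["c", "h", "w"])


def mismatch_before_fix(dims, num_classes_configured):
    """Old check: compares the first dimension with the configured class count."""
    return dims.c != num_classes_configured


def mismatch_after_fix(dims, num_classes_configured):
    """Fixed check: compares the second dimension with the configured class count."""
    return dims.h != num_classes_configured


# Values taken from the warning text: "detected by network: 300 4 1".
cov_layer_dims = Dims(c=300, h=4, w=1)

for configured in (5, 4):
    print(
        f"configured={configured} "
        f"before_fix_warns={mismatch_before_fix(cov_layer_dims, configured)} "
        f"after_fix_warns={mismatch_after_fix(cov_layer_dims, configured)}"
    )
```

Under these assumptions the old check warns for both 5 and 4, while the fixed check warns only for 5, which matches the advice to set num-detected-classes=4.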
