DeepStream with TLT ResNet50 model giving unknown warning

I have retrained a TLT ResNet50 model following this guide: https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/ but when I run it, it gives this warning:

WARNING: Num classes mismatch. Configured:5, detected by network: 300 4 1

The throughput is also very low, around 10 FPS. Is that because of the warning, or should I ignore it? Also, how can I improve the throughput?

--> DeepStream config file

# Copyright (c) 2018 NVIDIA Corporation. All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
kitti-track-output-dir=/dfs/AutomationWorkspace/metadata/20191016-170001/camera16/cam16Concat_28fps

#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=1
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
#(5): nvbuf-mem-handle - Allocate Surface Handle memory, applicable for Jetson
#(6): nvbuf-mem-system - Allocate Surface System memory, allocated using calloc
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:/dfs/AutomationWorkspace/EncodedVideos/20191016-170001/camera16/cam16Concat_28fps.mp4
#uri=file:/software/Videos_Concatenated/28fpsvideo_Encoded.mp4
num-sources=1
gpu-id=1

# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0

[sink1]
enable=1
type=1
output-file=/dfs/AutomationWorkspace/2019-09-17-01200-01500_objdt.mp4
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
# only SW mpeg4 is supported right now.
codec=3
sync=0
gpu-id=1
#iframeinterval=10
bitrate=2000000
#output-file=/software/td_cafe/take11/camera16/2019-09-17-01200-01500_objdt.mp4
source-id=0

[sink0]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=1
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=1
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=4
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000

## Set muxer output width and height
width=1280
height=720
#num-surfaces-per-frame=31
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=1
#model-engine-file=model_b4_int8.engine
labelfile-path=frcnn_labels.txt
batch-size=4
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_resnet50.txt

[tracker]
enable=1
tracker-width=320
tracker-height=180
#ll-lib-file=/usr/local/deepstream/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_tracker.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so
#ll-config-file required for IOU only
ll-config-file=/root/deepstream_sdk_v4.0_x86_64/samples/configs/deepstream-app/tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=1
enable-batch-process=1

[tests]
file-loop=0

--> resnet50 config file
[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=frcnn_labels.txt

# Provide the .etlt model exported by TLT or a TensorRT engine created by tlt-converter
# If use .etlt model, please also specify the key('nvidia_tlt')
model-engine-file=./rcnn.engine

tlt-encoded-model=frcnn_kitti_1.etlt
tlt-model-key=cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk
uff-input-dims=3;384;1280;0
uff-input-blob-name=input_1
batch-size=1

# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=5
interval=1
gie-unique-id=1
is-classifier=0
#network-type=0
output-blob-names=dense_regress/BiasAdd;dense_class/Softmax;proposal
parse-bbox-func-name=NvDsInferParseCustomFrcnnUff
custom-lib-path=nvdsinfer_customparser_frcnn_uff/libnvds_infercustomparser_frcnn_uff.so

[class-attrs-all]
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Hi muhammad,
Which Jetson platform are you running, Nano?
Could you please paste the full log along with the running command here? Thanks.

Hi Morganh
Thanks for the reply

We’re running on Tesla GPUs.

→ Logs
root@glistergpu1:~/deepstream_sdk_v4.0_x86_64/sources/apps/sample_apps/deepstream-app# deepstream-app -c /root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/deepstream_app_config_resnet50.txt
Creating LL OSD context new
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is ON
[NvDCF] Initialized
0:00:01.461487162 9915 0x55cbfc273f80 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:27.078436544 9915 0x55cbfc273f80 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/frcnn_kitti_1.etlt_b4_fp32.engine

Runtime commands:
h: Print this help
q: Quit

    p: Pause
    r: Resume

**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:163>: Pipeline ready

** INFO: <bus_callback:149>: Pipeline running

Creating LL OSD context new
WARNING: Num classes mismatch. Configured:5, detected by network: 300 4 1
**PERF: 11.91 (11.91)
**PERF: 11.16 (11.47)
**PERF: 11.16 (11.36)
**PERF: 11.34 (11.35)

Hi muhammad,
How many classes did you train for the frcnn network?
You can check your training log. Also, please double-check your frcnn_labels.txt accordingly.
Then please modify “num-detected-classes=5” to “num-detected-classes=4”.

I have changed it to 4, but I still get the same warning:
WARNING: Num classes mismatch. Configured:4, detected by network: 300 4 1

→ Config file for training
random_seed: 42
enc_key: 'cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk'
verbose: True
network_config {
  input_image_config {
    image_type: RGB
    image_channel_order: 'bgr'
    size_height_width {
      height: 384
      width: 1280
    }
    image_channel_mean {
      key: 'b'
      value: 103.939
    }
    image_channel_mean {
      key: 'g'
      value: 116.779
    }
    image_channel_mean {
      key: 'r'
      value: 123.68
    }
    image_scaling_factor: 1.0
  }
  feature_extractor: "resnet:50"
  anchor_box_config {
    scale: 64.0
    scale: 128.0
    scale: 256.0
    ratio: 1.0
    ratio: 0.5
    ratio: 2.0
  }
  freeze_bn: True
  freeze_blocks: 0
  freeze_blocks: 1
  roi_mini_batch: 256
  rpn_stride: 16
  conv_bn_share_bias: True
  roi_pooling_config {
    pool_size: 7
    pool_size_2x: False
  }
  all_projections: True
  use_pooling: False
}
training_config {
  kitti_data_config {
    images_dir: '/workspace/tlt-experiments/data/KITTI/training/image_2'
    labels_dir: '/workspace/tlt-experiments/data/KITTI/training/label_2'
  }
  training_data_parser: 'raw_kitti'
  data_augmentation {
    use_augmentation: True
    spatial_augmentation {
      hflip_probability: 0.5
      vflip_probability: 0.0
      zoom_min: 1.0
      zoom_max: 1.0
      translate_max_x: 0
      translate_max_y: 0
    }
    color_augmentation {
      color_shift_stddev: 0.0
      hue_rotation_max: 0.0
      saturation_shift_max: 0.0
      contrast_scale_max: 0.0
      contrast_center: 0.5
    }
  }
  num_epochs: 12
  class_mapping {
    key: 'Car'
    value: 0
  }
  class_mapping {
    key: 'Van'
    value: 0
  }
  class_mapping {
    key: "Pedestrian"
    value: 1
  }
  class_mapping {
    key: "Person_sitting"
    value: 1
  }
  class_mapping {
    key: 'Cyclist'
    value: 2
  }
  class_mapping {
    key: "background"
    value: 3
  }
  class_mapping {
    key: "DontCare"
    value: -1
  }
  class_mapping {
    key: "Truck"
    value: -1
  }
  class_mapping {
    key: "Misc"
    value: -1
  }
  class_mapping {
    key: "Tram"
    value: -1
  }
  pretrained_weights: "/workspace/tlt-experiments/data/faster_rcnn/resnet50.h5"
  pretrained_model: ""
  output_weights: "/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.tltw"
  output_model: "/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.tlt"
  rpn_min_overlap: 0.3
  rpn_max_overlap: 0.7
  classifier_min_overlap: 0.0
  classifier_max_overlap: 0.5
  gt_as_roi: False
  std_scaling: 1.0
  classifier_regr_std {
    key: 'x'
    value: 10.0
  }
  classifier_regr_std {
    key: 'y'
    value: 10.0
  }
  classifier_regr_std {
    key: 'w'
    value: 5.0
  }
  classifier_regr_std {
    key: 'h'
    value: 5.0
  }

  rpn_mini_batch: 256
  rpn_pre_nms_top_N: 12000
  rpn_nms_max_boxes: 2000
  rpn_nms_overlap_threshold: 0.7

  reg_config {
    reg_type: 'L2'
    weight_decay: 1e-4
  }

  optimizer {
    adam {
      lr: 0.00001
      beta_1: 0.9
      beta_2: 0.999
      decay: 0.0
    }
  }

  lr_scheduler {
    step {
      base_lr: 0.00001
      gamma: 1.0
      step_size: 30
    }
  }

  lambda_rpn_regr: 1.0
  lambda_rpn_class: 1.0
  lambda_cls_regr: 1.0
  lambda_cls_class: 1.0

  inference_config {
    images_dir: '/workspace/tlt-experiments/data/KITTI/val/image_2'
    model: '/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.epoch1.tlt'
    detection_image_output_dir: '/workspace/tlt-experiments/data/faster_rcnn/inference_results_imgs'
    labels_dump_dir: '/workspace/tlt-experiments/data/faster_rcnn/inference_dump_labels'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    bbox_visualize_threshold: 0.6
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
  }

  evaluation_config {
    dataset {
      images_dir: '/workspace/tlt-experiments/data/KITTI/val/image_2'
      labels_dir: '/workspace/tlt-experiments/data/KITTI/val/label_2'
    }
    data_parser: 'raw_kitti'
    model: '/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti.epoch1.tlt'
    labels_dump_dir: '/workspace/tlt-experiments/data/faster_rcnn/test_dump_labels'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
    object_confidence_thres: 0.0001
    use_voc07_11point_metric: False
  }

}

----> config file for retraining
random_seed: 42
enc_key: 'cmswbDk2OHFwcWgwZzAzdWw2ZzVkZjFlbWs6N2ZkMjFhMGItZmVhMS00NzRmLTk2YTQtOTU5NmUwNDAzMDlk'
verbose: True
network_config {
  input_image_config {
    image_type: RGB
    image_channel_order: 'bgr'
    size_height_width {
      height: 384
      width: 1280
    }
    image_channel_mean {
      key: 'b'
      value: 103.939
    }
    image_channel_mean {
      key: 'g'
      value: 116.779
    }
    image_channel_mean {
      key: 'r'
      value: 123.68
    }
    image_scaling_factor: 1.0
  }
  feature_extractor: "resnet:50"
  anchor_box_config {
    scale: 64.0
    scale: 128.0
    scale: 256.0
    ratio: 1.0
    ratio: 0.5
    ratio: 2.0
  }
  freeze_bn: True
  freeze_blocks: 0
  freeze_blocks: 1
  roi_mini_batch: 256
  rpn_stride: 16
  conv_bn_share_bias: True
  roi_pooling_config {
    pool_size: 7
    pool_size_2x: False
  }
  all_projections: True
  use_pooling: False
}
training_config {
  kitti_data_config {
    images_dir: '/workspace/tlt-experiments/data/KITTI/training/image_2'
    labels_dir: '/workspace/tlt-experiments/data/KITTI/training/label_2'
  }
  training_data_parser: 'raw_kitti'
  data_augmentation {
    use_augmentation: True
    spatial_augmentation {
      hflip_probability: 0.5
      vflip_probability: 0.0
      zoom_min: 1.0
      zoom_max: 1.0
      translate_max_x: 0
      translate_max_y: 0
    }
    color_augmentation {
      color_shift_stddev: 0.0
      hue_rotation_max: 0.0
      saturation_shift_max: 0.0
      contrast_scale_max: 0.0
      contrast_center: 0.5
    }
  }
  num_epochs: 12
  class_mapping {
    key: 'Car'
    value: 0
  }
  class_mapping {
    key: 'Van'
    value: 0
  }
  class_mapping {
    key: "Pedestrian"
    value: 1
  }
  class_mapping {
    key: "Person_sitting"
    value: 1
  }
  class_mapping {
    key: 'Cyclist'
    value: 2
  }
  class_mapping {
    key: "background"
    value: 3
  }
  class_mapping {
    key: "DontCare"
    value: -1
  }
  class_mapping {
    key: "Truck"
    value: -1
  }
  class_mapping {
    key: "Misc"
    value: -1
  }
  class_mapping {
    key: "Tram"
    value: -1
  }
  pretrained_weights: ""
  pretrained_model: "/workspace/tlt-experiments/data/faster_rcnn/model_2_pruned.tlt"
  output_weights: "/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.tltw"
  output_model: "/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.tlt"
  rpn_min_overlap: 0.3
  rpn_max_overlap: 0.7
  classifier_min_overlap: 0.0
  classifier_max_overlap: 0.5
  gt_as_roi: False
  std_scaling: 1.0
  classifier_regr_std {
    key: 'x'
    value: 10.0
  }
  classifier_regr_std {
    key: 'y'
    value: 10.0
  }
  classifier_regr_std {
    key: 'w'
    value: 5.0
  }
  classifier_regr_std {
    key: 'h'
    value: 5.0
  }

  rpn_mini_batch: 256
  rpn_pre_nms_top_N: 12000
  rpn_nms_max_boxes: 2000
  rpn_nms_overlap_threshold: 0.7

  reg_config {
    reg_type: 'L2'
    weight_decay: 1e-4
  }

  optimizer {
    adam {
      lr: 0.00001
      beta_1: 0.9
      beta_2: 0.999
      decay: 0.0
    }
  }

  lr_scheduler {
    step {
      base_lr: 0.00001
      gamma: 1.0
      step_size: 30
    }
  }

  lambda_rpn_regr: 1.0
  lambda_rpn_class: 1.0
  lambda_cls_regr: 1.0
  lambda_cls_class: 1.0

  inference_config {
    images_dir: '/workspace/tlt-experiments/data/KITTI/val/image_2'
    model: '/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.epoch1.tlt'
    detection_image_output_dir: '/workspace/tlt-experiments/data/faster_rcnn/inference_results_imgs_retrain'
    labels_dump_dir: '/workspace/tlt-experiments/data/faster_rcnn/inference_dump_labels_retrain'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    bbox_visualize_threshold: 0.6
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
  }

  evaluation_config {
    dataset {
      images_dir: '/workspace/tlt-experiments/data/KITTI/val/image_2'
      labels_dir: '/workspace/tlt-experiments/data/KITTI/val/label_2'
    }
    data_parser: 'raw_kitti'
    model: '/workspace/tlt-experiments/data/faster_rcnn/frcnn_kitti_retrain.epoch1.tlt'
    labels_dump_dir: '/workspace/tlt-experiments/data/faster_rcnn/test_dump_labels_retrain'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
    object_confidence_thres: 0.0001
    use_voc07_11point_metric: False
  }

}

Could you please paste your frcnn_labels.txt? Thanks.

--> frcnn_labels.txt

Automobile
Bicycle
Person
Roadsign
background

According to the TLT documentation:
The label file is a text file, containing the names of the classes that the FasterRCNN model is trained to detect. The order in which the classes are listed here must match the order in which the model predicts the output. This order is derived from the order the objects are instantiated in the class_mapping field of the FasterRCNN experiment specification file.

So, according to your training spec, could you please modify your frcnn_labels.txt as below and try again?

Car
Pedestrian
Cyclist
background
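
For reference, the four-entry list above can be reproduced mechanically from the class_mapping entries in the training spec. The C++ sketch below is only one reading of the documented rule, not TLT code: it keeps the first key seen for each class id, in order of first appearance, and skips entries mapped to -1 on the assumption that those classes are ignored. `labelsFromClassMapping` and `labelsForThisThread` are hypothetical helper names.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of TLT or DeepStream): keeps the first key seen
// for each non-negative class id, in the order the ids first appear.
std::vector<std::string> labelsFromClassMapping(
    const std::vector<std::pair<std::string, int>> &classMapping) {
    std::vector<std::string> labels;
    std::set<int> seenIds;
    for (const auto &entry : classMapping) {
        if (entry.second < 0) {
            continue;  // assumption: entries mapped to -1 are ignored classes
        }
        if (seenIds.insert(entry.second).second) {
            labels.push_back(entry.first);  // first key for this class id
        }
    }
    return labels;
}

// Applies the helper to the class_mapping entries posted in this thread.
std::vector<std::string> labelsForThisThread() {
    const std::vector<std::pair<std::string, int>> classMapping = {
        {"Car", 0}, {"Van", 0}, {"Pedestrian", 1}, {"Person_sitting", 1},
        {"Cyclist", 2}, {"background", 3}, {"DontCare", -1}, {"Truck", -1},
        {"Misc", -1}, {"Tram", -1}};
    const std::vector<std::string> labels = labelsFromClassMapping(classMapping);
    assert(labels.size() == 4);  // Car, Pedestrian, Cyclist, background
    return labels;
}
```

For the spec in this thread the sketch yields Car, Pedestrian, Cyclist, background, i.e. four entries, which is consistent with num-detected-classes=4.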

Tried it, but still no change:

Creating LL OSD context new
WARNING: Num classes mismatch. Configured:4, detected by network: 300 4 1

Hi muhammad,
There might be a false warning in https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps/blob/master/nvdsinfer_customparser_frcnn_uff/nvdsinfer_custombboxparser_frcnn_uff.cpp
We will look into it.

Hi muhammad,
The fix is now available on GitHub.
Thanks very much for your finding and contribution.

@@ -309,7 +309,7 @@ bool NvDsInferParseCustomFrcnnUff (std::vector<NvDsInferLayerInfo> const &output

    /* Warn in case of mismatch in number of classes */
    if (!classMismatchWarn) {
-       if (covLayerDims.c != detectionParams.numClassesConfigured) {
+       if (covLayerDims.h != detectionParams.numClassesConfigured) {
            std::cerr << "WARNING: Num classes mismatch. Configured:" <<
                      detectionParams.numClassesConfigured << ", detected by network: " <<
                      covLayerDims.c << " " << covLayerDims.h << " " << covLayerDims.w << std::endl;
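
To illustrate why the original check fired for any configured value other than 300, here is a minimal C++ sketch of the two comparisons. `LayerDims`, `mismatchBeforeFix`, `mismatchAfterFix`, and `kReported` are hypothetical names for illustration and are not the DeepStream types; the values 300, 4, 1 are the c, h, w dimensions printed in the warning earlier in this thread.

```cpp
// Hypothetical stand-in for the layer dimensions used by the parser
// (illustration only, not the DeepStream type).
struct LayerDims {
    unsigned int c;
    unsigned int h;
    unsigned int w;
};

// Check as it was before the fix: compares the c dimension.
constexpr bool mismatchBeforeFix(const LayerDims &dims,
                                 unsigned int numClassesConfigured) {
    return dims.c != numClassesConfigured;
}

// Check after the fix: compares the h dimension.
constexpr bool mismatchAfterFix(const LayerDims &dims,
                                unsigned int numClassesConfigured) {
    return dims.h != numClassesConfigured;
}

// Dimensions reported in the log: "detected by network: 300 4 1".
constexpr LayerDims kReported{300, 4, 1};

static_assert(mismatchBeforeFix(kReported, 4), "old check warns even with 4 configured");
static_assert(!mismatchAfterFix(kReported, 4), "new check is silent with 4 configured");
static_assert(mismatchAfterFix(kReported, 5), "new check still warns with 5 configured");
```

Under these assumptions, num-detected-classes=4 no longer triggers the warning once the fix is applied, while the original num-detected-classes=5 would still be reported as a mismatch.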