Segmentation fault when using secondary classifier

Test3: change the model

I have changed my model to a very simple one with random weights, just to verify things. The current model consists of a Conv2d layer with 3 input channels and 128 output channels, followed by an adaptive average pooling layer with output size 1.
Input to the network: [1, 3, 200, 200], name: input
Output: [1, 128], name: output
(facenet/simple.onnx at master · dangraf/facenet · GitHub)
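
For reference, a model with this structure can be defined and exported in a few lines of PyTorch. This is only a sketch of the architecture described above (file and layer names are illustrative), not the exact script I used:

import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    """Conv2d 3 -> 128 channels, then global average pooling to [N, 128]."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 128, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        x = self.pool(self.conv(x))   # [N, 128, 1, 1]
        return torch.flatten(x, 1)    # [N, 128]

model = SimpleNet().eval()
dummy = torch.randn(1, 3, 200, 200)   # input "input": [1, 3, 200, 200]
torch.onnx.export(model, dummy, "simple.onnx",
                  input_names=["input"], output_names=["output"])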

I converted the model to an engine file and renamed it.
I also tried to use the nvinferserver plugin, where it is possible to disable all post-processing:

[config.pbtxt]
name: "facenet_trt"
platform: "tensorrt_plan"
max_batch_size: 1
default_model_filename: "facenet.onnx_b1_gpu0_fp16.engine"
input [
{
name: "input"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [ 3, 160, 160 ]
}
]
output [
{
name: "output"
data_type: TYPE_FP32
dims: [ 128 ]
}
]
instance_group [
{
kind: KIND_GPU
count: 1
gpus: 0
}
]

[config-file]
infer_config {
unique_id: 2
gpu_ids: [0]
max_batch_size: 1
backend {
trt_is {
model_name: "facenet_trt"
version: -1
model_repo {
root: "."
log_level: 2
strict_model_config:false
tf_gpu_memory_fraction: 0.2
tf_disable_soft_placement: 0
}
}
}
preprocess {
network_format: IMAGE_FORMAT_RGB
tensor_order: TENSOR_ORDER_NONE
maintain_aspect_ratio: 1
normalize {
scale_factor: 0.0039215697906911373
channel_offsets: [0, 0, 0]
}
}
postprocess {
other {}
}
extra {
copy_input_to_host_buffers: false
}
}
input_control {
process_mode: PROCESS_MODE_CLIP_OBJECTS
operate_on_gie_id: 1
operate_on_class_ids: 0
async_mode:0
interval: 0
}
output_control {
output_tensor_meta: true
}
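
In the Python pipeline, the [config-file] above is handed to the nvinferserver element via its config-file-path property, roughly like this (the element and file names are just placeholders):

import sys
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Secondary inference through Triton (nvinferserver) instead of nvinfer
sgie = Gst.ElementFactory.make("nvinferserver", "secondary-inference")
if not sgie:
    sys.stderr.write("Unable to create nvinferserver element\n")
    sys.exit(1)
sgie.set_property("config-file-path", "config_infer_secondary_facenet.txt")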

I still get the same behavior, where the DeepStream app freezes.
I guess this is a bug in DeepStream?

When you run facenet as a secondary classifier, what is the primary detector? Can you post the PGIE’s configuration?

I’ve tested using both the DetectNet for faces provided by NVIDIA and the default detector in test1.
Here is a complete example of code, models and configuration files (GitHub - dangraf/deepstreambug)

Here is the configuration file for the PGIE (deepstreambug/dstest1_pgie_config.txt at main · dangraf/deepstreambug · GitHub). It’s the original file, except that I have changed to absolute paths for the detector.

Can you set "network-type=100" and try? For your deepstream_test_1_nvinfer_p.py setting, I think there will be no output from your face model. And for the SGIE (deepstreambug/sgie_secondary.txt at main · dangraf/deepstreambug · GitHub), it is better to set the following:
input-object-min-width=10
input-object-min-height=10

I’ve tried setting network-type to 100 for the primary, which works.
I’ve also tried setting network-type to 100 and the object minimum width/height as suggested on the secondary, but the app still crashes.

I’ve not verified the output from the primary, but I have verified the output from the secondary: I can access the metadata and print the size of the output in the terminal for the first detected object. But I’ve never managed to print out two of these detections.
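
The check on the secondary output is done with a pad probe on the SGIE src pad, along the lines of the deepstream-ssd-parser sample. A sketch of what I mean (the probe and helper names are just examples):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def _iter(glist, cast):
    """Walk a pyds GList and yield items cast to the given meta type."""
    while glist is not None:
        yield cast(glist.data)
        try:
            glist = glist.next
        except StopIteration:
            break

def sgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    for frame_meta in _iter(batch_meta.frame_meta_list, pyds.NvDsFrameMeta.cast):
        for obj_meta in _iter(frame_meta.obj_meta_list, pyds.NvDsObjectMeta.cast):
            for user_meta in _iter(obj_meta.obj_user_meta_list, pyds.NvDsUserMeta.cast):
                if user_meta.base_meta.meta_type != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                    continue
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                for i in range(tensor_meta.num_output_layers):
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                    print("object", obj_meta.object_id, "output layer", layer.layerName)
    return Gst.PadProbeReturn.OK

# Attached to the SGIE src pad:
# sgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, sgie_src_pad_buffer_probe, 0)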

It also seems like the program just gets stuck when using the provided files, but if I add a tracker and set async-mode to 1, this changes to a segmentation fault.
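
The tracker is inserted between the PGIE and the SGIE roughly as in the test2 sample; the library path and dimensions below are only illustrative, and pipeline, pgie and sgie are the elements already created in the test app:

# KLT tracker between primary detector and secondary classifier
tracker = Gst.ElementFactory.make("nvtracker", "tracker")
tracker.set_property("tracker-width", 640)
tracker.set_property("tracker-height", 384)
tracker.set_property("ll-lib-file",
                     "/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_mot_klt.so")
pipeline.add(tracker)
pgie.link(tracker)
tracker.link(sgie)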

I’ve also monitored the memory consumption and can see that it’s pretty constant at about 4 GB. It seems like the program is writing somewhere in memory it shouldn’t, which causes the segmentation fault or sometimes makes the program just crash.

Any other suggestions?
Do you still think the problem is in my code?

Please refer to deepstream_python_apps/deepstream_test_2.py at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub for PGIE+SGIE usage.

I do not understand what you are trying to suggest with your link. Yes, I have tested that example and it works.
One big difference is that output-tensor-meta is enabled in my example, which seems to cause the problem.
Anyway, I shouldn’t get a segmentation fault or an application hang without any warnings even if I set up the configuration file a bit wrong, right?

No. If there is something wrong with the configuration, there may be some error.

The sample that enables tensor meta is deepstream_python_apps/apps/deepstream-ssd-parser at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub

Or you can share your model with us so we can reproduce the problem.

My model is located here:
And the project you are referring to enables output-tensor-meta in the PRIMARY classifier, which works in my example above. But when it is enabled on the SECONDARY classifier, it fails.

The model is not provided by NVIDIA.
I have modified deepstream_python_apps/apps/deepstream-test2 at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub to support tensor output with the SGIE. Please refer to the modified code and the new config file for the SGIE: deepstream_test_2.py (14.1 KB) dstest2_sgie1_config.txt (3.6 KB)

The provided files do not work; it’s complaining about the tracker.

So I just used test2 and changed some parameters in the configuration file.
It works when it is run as a clean example.

Test1:

output-tensor-meta=1: Works!

Test2:

output-tensor-meta=1
#is-classifier=1: Segmentation fault

What does “is-classifier” do? I can’t find it in the documentation for nvinfer.
I thought that parameter was replaced by “network-type”, which tells whether it’s a classification, segmentation, or other task.

Test3
The test you provided uses a precompiled 8-bit version of a network, while I’m using a 16-bit ONNX file. So I changed the model to densenet_onnx.

onnx-file=../../../../samples/trtis_model_repo/densenet_onnx/1/model.onnx
labelfile-path=../../../../samples/trtis_model_repo/densenet_onnx/densenet_labels.txt
network-mode=2
is-classifier=1
output-tensor-meta=1

full config:

[property]
gpu-id=0
net-scale-factor=1
onnx-file=../../../../samples/trtis_model_repo/densenet_onnx/1/model.onnx
labelfile-path=../../../../samples/trtis_model_repo/densenet_onnx/densenet_labels.txt
batch-size=16
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
input-object-min-width=64
input-object-min-height=64
process-mode=2
model-color-format=1
gpu-id=0
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
is-classifier=1
output-blob-names=predictions/Softmax
classifier-async-mode=1
classifier-threshold=0.51
process-mode=2
#scaling-filter=0
output-tensor-meta=1

This test fails:

InferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1702> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input_1 3x368x640
1 OUTPUT kFLOAT conv2d_bbox 16x23x40
2 OUTPUT kFLOAT conv2d_cov/Sigmoid 4x23x40

0:02:22.976871217 167 0x241d8150 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1806> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
0:02:22.979857810 167 0x241d8150 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 1]: Load new model:dstest2_pgie_config.txt sucessfully
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
KLT Tracker Init
Frame Number=0 Number of Objects=5 Vehicle_count=3 Person_count=2
Segmentation fault (core dumped)

Are you able to create a configuration file for the ONNX model with output-tensor-meta?

Fiona,
This way of debugging the system seems very inefficient; a week has passed and it’s still not solved.
My first question was “How do I proceed with the debugging?” because I would like to understand why things are going wrong.

But your answers are about things I should test, and you are giving me code examples that do not reflect the problem I’m trying to solve.
You are asking questions about debug messages, which model I have, etc., and I answer. It would be helpful to understand what your conclusions are from the answers I give you. For example, you told me to share the configuration files. You have the complete projects and source code. Could you please point out the errors I have made? You have asked twice about the model, and you have the source code. If there is a problem with my model, I would like to know. You tell me that I have errors in my project; where are they?
It seems like this question is stuck between you and me because you don’t provide me with the information I need to continue my debugging.
Can I buy hours from an NVIDIA developer to solve this problem for me? Sure!
I need this problem solved. Please help me.

The code I provided works on my Jetson board. If it does not work on your board, there may be some other problem with your environment.

There is no special debugging method for DeepStream. Even though you run a Python script, the working part is implemented in C/C++; you can debug the C/C++ part if you think the problem is caused by DeepStream. But currently what you have posted shows that it is not a DeepStream problem.

You are trying to use DeepStream with customized models (the model is not one of the types supported by nvinfer). nvinfer supports customization, but it requires the user to be familiar with the working mechanism of nvinfer. nvinfer is implemented in C/C++, and the code is open source.

nvinfer supports classifier, detector, segmentation, and instance segmentation models (Image Classification — Transfer Learning Toolkit 3.0 documentation (nvidia.com)). If your model is none of these, you cannot trigger any default processing, or it may run into unknown problems.

For the test2 sample, with the SGIE, if you set "output-tensor-meta=1", that means you want to export the model output for external processing, so you cannot trigger any internal processing of the model output. "network-type=100" should be set in this case to tell nvinfer not to process the model output. "is-classifier" is not needed if "network-type" has been set correctly.
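
For example, the SGIE config could contain something like this (only the relevant keys are shown; the other properties stay as in your existing file):

[property]
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
# 2 = secondary mode, operate on objects from the PGIE
process-mode=2
# 100 = "other": nvinfer does no internal parsing of the model output
network-type=100
# attach the raw output tensors to the object meta instead
output-tensor-meta=1
# do not set is-classifier or any classifier-* properties in this mode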

The default nvinfer classifier has only one output layer, but your model has two output layers, so your model is not supported by the default classifier processor; you cannot enable "is-classifier" with your model.

Tell me how I can debug the C/C++ parts in my Python app. Can I use gdb? I can only find how to dump out data in the console (Basic tutorial 11: Debugging tools).

" But currently what you have posted shows that it is not deepstream problem."
You are again implying that I have errors in my code without pointing out which they are.
For example this settings file which have network-type=“other” or 100. It still crashes.

I ask for A and you answer B. Please read the questions before you post.

I have sent you the code. The code works.

You have an error either in the code or in the configuration file. ;)

It’s complaining about the tracker config file, which you have not provided, when running these as a separate project. (Do you see the difference? I point out the exact error that was made so you can fix it if you want.)
It works when I copy the files to the test2 folder, as explained in the post from Mar 29.

And I have also pointed out a problem with your configuration file: you are running it with a precompiled 8-bit engine model.

As explained in Test3, where I use an ONNX model in 16-bit mode as a classifier (because densenet is a classifier) and enable metadata output, it still crashes. This model has one output layer.

You have still not provided any example or explanation showing that DeepStream is working correctly.

What kind of model? ONNX? Can you post the crash log without the GST_DEBUG setting?

For your Test3, the model has two output layers; you need to set "network-type=100" and remove "is-classifier=1". In your log, the input layer dimension is 3x368x640, so you should set "input-object-min-width=40" and "input-object-min-height=23", not "input-object-min-width=64" and "input-object-min-height=64".

Thanks!
You are telling me the network has two output layers. How do I inspect that? It seems to me that it has one output layer (name: “fc6_1”), which contains the classes (dims: 1000). What is the other output layer?
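
For reference, the declared outputs of an ONNX graph can be listed with the onnx Python package; a minimal sketch, where the model path is a placeholder:

import onnx

# Load the densenet ONNX model and print the declared graph inputs and outputs
model = onnx.load("model.onnx")
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])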

And my second question is about the object dimensions. Do you mean that the ratio between the input dimensions W and H needs to be preserved for input-object-min-width/height?
Because I understood that it only matters if the “maintain-aspect-ratio” flag is set to 1.

“Indicates whether to maintain aspect ratio while scaling input. DeepStream currently does asymmetric padding only.”