Segmentation fault after OTA model update in deepstream-test5

• Hardware Platform (Jetson / GPU): Jetson
• DeepStream Version: 6.0
• JetPack Version (valid for Jetson only): 4.6
• TensorRT Version: 8.0.1
• NVIDIA GPU Driver Version (valid for GPU only): NA
• Issue Type (questions, new requirements, bugs): Questions

Hello,

We are trying to use the Deepstream Test5 app to do OTA model updates. I tested the provided example and it works perfectly. I then ran the same example with our custom model, a YOLO_V4, in place of the provided resnet10 caffe model, and got the error below. The model does update, but then I get an error saying that the model failed to parse bounding boxes, followed by a segmentation fault.

I saw the following post, which indicates that the segmentation fault is caused by the custom bbox parser. Has that issue been resolved yet?

Thank you in advance!

./deepstream-test5-app -c …/sources/apps/sample_apps/deepstream-test5/configs/test5_config_file_src_infer.txt -o …/sources/apps/sample_apps/deepstream-test5/configs/test5_ota_override_config.txt
REAL PATH = /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-test5/configs/test5_ota_override_config.txt

Using winsys: x11
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
0:00:04.653483136 10534 0x78d2070 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT Input 3x736x1280
1 OUTPUT kINT32 BatchedNMS 1
2 OUTPUT kFLOAT BatchedNMS_1 200x4
3 OUTPUT kFLOAT BatchedNMS_2 200
4 OUTPUT kFLOAT BatchedNMS_3 200

0:00:04.653787552 10534 0x78d2070 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:04.695085248 10534 0x78d2070 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/infer_config.txt sucessfully

Runtime commands:
h: Print this help
q: Quit

p: Pause
r: Resume

NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
To go back to the tiled display, right-click anywhere on the window.

** INFO: <bus_callback:194>: Pipeline ready

Opening in BLOCKING MODE
Opening in BLOCKING MODE
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NvMMLiteOpen : Block : BlockType = 261
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
NvMMLiteBlockCreate : Block : BlockType = 261
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:180>: Pipeline running

**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg)
Mon May 2 17:27:15 2022
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
Mon May 2 17:27:20 2022
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
Mon May 2 17:27:25 2022
**PERF: 7.20 (7.12) 7.20 (7.12) 7.20 (7.12) 7.20 (7.12)
File test5_ota_override_config.txt modified.

New Model Update Request primary_gie ----> /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:16.550898944 10534 0x7ea41ca960 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1161> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:17.250502496 10534 0x7ea41ca960 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT Input 3x736x1280
1 OUTPUT kINT32 BatchedNMS 1
2 OUTPUT kFLOAT BatchedNMS_1 200x4
3 OUTPUT kFLOAT BatchedNMS_2 200
4 OUTPUT kFLOAT BatchedNMS_3 200

0:00:17.250732832 10534 0x7ea41ca960 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:17.607271104 10534 0x7194cf0 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine sucessfully

Model Update Status: Updated model : /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine, OTATime = 1060.202000 ms, result: ok

0:00:17.665835872 10534 0x6eba6d0 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:59> [UID = 1]: Could not find output coverage layer for parsing objects
0:00:17.665992064 10534 0x6eba6d0 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:735> [UID = 1]: Failed to parse bboxes
Segmentation fault (core dumped)

test5_config_file_src_infer.txt (6.3 KB)

test5_ota_override_config.txt (2.2 KB)

Hi @megan.e.morrison
Can you check whether your model update meets the "Assumption for On-The-Fly model updates" listed in "DeepStream Reference Application - deepstream-test5 app" in the DeepStream 6.1 Release documentation?

Hi. Yes, our model update meets the “Assumption for On-The-Fly model updates”. I’ve tested this app both by updating the model and by leaving the model unchanged and only adding a newline to the test5_ota_override_config.txt file (which was suggested in another post). I get the same error either way.

Hi @megan.e.morrison ,
From this error, it failed in the code below.
The logged message is printed when no output layer name contains "cov", which means the check "if (strstr(outputLayersInfo[i].layerName, "cov") != nullptr)" never matched. Could you add some debug code in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer to check why there is no output layer named "cov" (or "bbox")?

File: /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl_output_parsing.cpp

bool
DetectPostprocessor::parseBoundingBox(vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    vector<NvDsInferObjectDetectionInfo>& objectList)
{

    int outputCoverageLayerIndex = -1;
    int outputBBoxLayerIndex = -1;


    for (unsigned int i = 0; i < outputLayersInfo.size(); i++)
    {
        if (strstr(outputLayersInfo[i].layerName, "bbox") != nullptr)
        {
            outputBBoxLayerIndex = i;
        }
        if (strstr(outputLayersInfo[i].layerName, "cov") != nullptr)
        {
            outputCoverageLayerIndex = i;
        }
    }

    if (outputCoverageLayerIndex == -1)
    {
        printError("Could not find output coverage layer for parsing objects");
        return false;
    }
    if (outputBBoxLayerIndex == -1)
    {
        printError("Could not find output bbox layer for parsing objects");
        return false;
    }
  ...
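
To see why this lookup fails for the engine in the log, the name search can be reproduced outside DeepStream. This is only a minimal sketch, not the DeepStream source: LayerInfo, LayerMatch, findResnetLayers and yoloV4Layers are hypothetical names introduced here, and the layer names are the ones printed in the engine info above.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for NvDsInferLayerInfo; only the name is needed here.
struct LayerInfo {
    const char* layerName;
};

// Indices of the layers the default parser looks for; -1 means "not found".
struct LayerMatch {
    int bboxIndex = -1;
    int covIndex = -1;
};

// Same substring search as the excerpt above, isolated from DeepStream.
LayerMatch findResnetLayers(const std::vector<LayerInfo>& layers) {
    LayerMatch match;
    for (unsigned int i = 0; i < layers.size(); i++) {
        assert(layers[i].layerName != nullptr);  // strstr needs a valid string
        if (std::strstr(layers[i].layerName, "bbox") != nullptr) {
            match.bboxIndex = static_cast<int>(i);
        }
        if (std::strstr(layers[i].layerName, "cov") != nullptr) {
            match.covIndex = static_cast<int>(i);
        }
    }
    return match;
}

// Output layer names reported for the YOLO_V4 engine in the log above.
std::vector<LayerInfo> yoloV4Layers() {
    return {{"BatchedNMS"}, {"BatchedNMS_1"}, {"BatchedNMS_2"}, {"BatchedNMS_3"}};
}
```

Calling findResnetLayers(yoloV4Layers()) leaves both indices at -1, because none of the BatchedNMS names contains "bbox" or "cov". That is the condition that produces the "Could not find output coverage layer" error.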

Hi @mchi ,

Thank you for pointing that out. I think I have a good idea of what is going on with this error.

It looks like the function DetectPostprocessor::parseBoundingBox(vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo, NvDsInferParseDetectionParams const& detectionParams, vector<NvDsInferObjectDetectionInfo>& objectList) assumes the detector is a resnet10 model (which is what the provided example uses). However, I am using a YOLO_v4 model, so the output layers are different from what the code is looking for. For the resnet10 model, the parser needs output layers whose names contain bbox and cov, but my YOLO_v4 model's output layers are BatchedNMS, BatchedNMS_1, BatchedNMS_2, and BatchedNMS_3.

The comment before this function also states that the function was written specifically for the sample resnet10 model that’s provided.

I can go through this function and make updates so it works with the YOLO_v4 layers rather than the resnet10 layers. Or, would it be better to write a custom output parser?
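
For reference, the decoding step of a custom parser for these four outputs could look roughly like the sketch below. It is written under assumptions that should be checked against the actual model: boxes are taken to be normalized [x1, y1, x2, y2] corners, class ids are taken to be stored as floats, and Detection, clampTo and decodeBatchedNms are hypothetical names that stand in for the real NvDsInfer types and buffers. In an actual parser the four arrays would be read from outputLayersInfo, and the function would be exposed to nvinfer through the parse-bbox-func-name and custom-lib-path settings in the inference config.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for NvDsInferObjectDetectionInfo.
struct Detection {
    unsigned int classId;
    float left, top, width, height;
    float confidence;
};

// Clamp a pixel coordinate to the range [0, maxValue].
static float clampTo(float value, float maxValue) {
    return std::min(std::max(value, 0.0f), maxValue);
}

// Decodes one frame of BatchedNMS output.
// Assumed layout (matches the shapes in the log, but verify for your model):
//   keepCount : number of valid detections          (BatchedNMS,   1)
//   boxes     : normalized [x1, y1, x2, y2] per box (BatchedNMS_1, 200x4)
//   scores    : confidence per box                  (BatchedNMS_2, 200)
//   classes   : class id stored as float            (BatchedNMS_3, 200)
std::vector<Detection> decodeBatchedNms(int keepCount,
                                        const std::vector<float>& boxes,
                                        const std::vector<float>& scores,
                                        const std::vector<float>& classes,
                                        float netWidth, float netHeight,
                                        float threshold) {
    assert(boxes.size() == 4 * scores.size());
    assert(classes.size() == scores.size());
    // Never read past the buffers, even if keepCount is out of range.
    const int maxCount = static_cast<int>(scores.size());
    const int count = std::min(std::max(keepCount, 0), maxCount);

    std::vector<Detection> out;
    for (int i = 0; i < count; i++) {
        if (scores[i] < threshold) continue;
        // Scale normalized corners to network-input pixels and clamp them.
        const float x1 = clampTo(boxes[4 * i + 0] * netWidth, netWidth - 1);
        const float y1 = clampTo(boxes[4 * i + 1] * netHeight, netHeight - 1);
        const float x2 = clampTo(boxes[4 * i + 2] * netWidth, netWidth - 1);
        const float y2 = clampTo(boxes[4 * i + 3] * netHeight, netHeight - 1);
        if (x2 <= x1 || y2 <= y1) continue;  // skip degenerate boxes
        out.push_back({static_cast<unsigned int>(classes[i]),
                       x1, y1, x2 - x1, y2 - y1, scores[i]});
    }
    return out;
}
```

The key difference from the resnet10 parser is that the layers are addressed by their role in the BatchedNMS output rather than by searching for "bbox" and "cov" in the layer names.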

I’m wondering why it can run successfully before the model OTA if you are running YoloV4 with the default post-processor in DS.

Honestly I am wondering this too.

Anyway, I think you need to have your own YoloV4 post-processor.

GitHub - NVIDIA-AI-IOT/yolov4_deepstream can serve as a reference for YoloV4 in DS.

Hi @mchi,

Is this post-processor necessary when I am using a .etlt model file from TAO and converting it to a .engine file with the tao-converter? I looked into the code on the Github page provided, and it seems to be intended for models that are trained outside of TAO. Please correct me if I am wrong on this.

I was using this Github page to create my custom bounding box parser for YOLO_V4.

@mchi , I have found a solution to this problem! I will post my solution a little later today.

The solution is as follows:

I noticed that when the Deepstream app starts up and initializes the model, the file being loaded is our infer_config.txt for the primary model. However, when OTA is triggered, it tries to load the .engine file instead of the infer_config.txt (compare the two "Load new model:" lines in the log below):

clear; ./deepstream-test5-app -c configs/test5_config_file_src_infer.txt -o configs/test5_ota_override_config.txt

override cfg file [i] configs/test5_ota_override_config.txtREAL PATH = /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-test5/configs/test5_ota_override_config.txt

Using winsys: x11
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:04.758396960 13546 0x55a2f14e70 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1905> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT Input 3x736x1280
1 OUTPUT kINT32 BatchedNMS 1
2 OUTPUT kFLOAT BatchedNMS_1 200x4
3 OUTPUT kFLOAT BatchedNMS_2 200
4 OUTPUT kFLOAT BatchedNMS_3 200

0:00:04.758766944 13546 0x55a2f14e70 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2009> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:05.240483424 13546 0x55a2f14e70 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/infer_config.txt sucessfully

Runtime commands:
h: Print this help
q: Quit

p: Pause
r: Resume

NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
To go back to the tiled display, right-click anywhere on the window.

** INFO: <bus_callback:194>: Pipeline ready

**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg)
Thu Jun 16 16:53:03 2022
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
Opening in BLOCKING MODE
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Opening in BLOCKING MODE
Opening in BLOCKING MODE
NvMMLiteBlockCreate : Block : BlockType = 261
NvMMLiteOpen : Block : BlockType = 261
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:180>: Pipeline running

Thu Jun 16 16:53:08 2022
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
WARNING; playback mode used with URI [file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4] not conforming to timestamp format; check README; using system-time
Thu Jun 16 16:53:13 2022
**PERF: 6.85 (6.54) 6.85 (6.54) 6.85 (6.54) 7.09 (6.96)
Thu Jun 16 16:53:18 2022
**PERF: 6.32 (6.37) 6.32 (6.37) 6.32 (6.37) 6.25 (6.56)
File test5_ota_override_config.txt modified.

New Model Update Request primary_gie ----> /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:22.910550464 13546 0x7e7c00c410 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1166> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:23.868270144 13546 0x7e7c00c410 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1905> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT Input 3x736x1280
1 OUTPUT kINT32 BatchedNMS 1
2 OUTPUT kFLOAT BatchedNMS_1 200x4
3 OUTPUT kFLOAT BatchedNMS_2 200
4 OUTPUT kFLOAT BatchedNMS_3 200

0:00:23.870277984 13546 0x7e7c00c410 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2009> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
0:00:24.319700896 13546 0x55a27d8cf0 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine sucessfully

Model Update Status: Updated model : /opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine, OTATime = 1410.254000 ms, result: ok

0:00:24.377393952 13546 0x55a24fced0 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:95> [UID = 1]: Could not find output coverage layer for parsing objects
0:00:24.377490304 13546 0x55a24fced0 ERROR nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:771> [UID = 1]: Failed to parse bboxes

In order to fix this, I changed 3 lines of code in the deepstream_test5_app_main.c file.

On line 1173, I added:

gchar *config_file_path =
        ota_appCtx->override_config.primary_gie_config.config_file_path;

On line 1181 I commented out the following:

//g_object_set (G_OBJECT (primary_gie), "model-engine-file",
//          model_engine_file_path, NULL);

On line 1184, I added:

      g_object_set (G_OBJECT (primary_gie), "config-file-path",
          config_file_path, NULL);

The Test5 app now loads the infer_config.txt file instead of the .engine file, and OTA works as expected.
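
Note that this change reads config_file_path from the override configuration, so the override file needs a config-file entry for the primary GIE. A sketch of what that section might look like, assuming the override file uses the same [primary-gie] keys as the main app config and reusing the paths from the log above (adjust to your setup):

```ini
[primary-gie]
enable=1
# Engine referenced by the "New Model Update Request" line in the log
model-engine-file=/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/model.etlt_b1_gpu0_int8.engine
# Inference config that the modified code passes to nvinfer as "config-file-path"
config-file=/opt/nvidia/deepstream/deepstream-6.0/samples/models/model_test/release/infer_config.txt
```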