Caffe Model (and others) Output-Blob-Name Options

zak4 · August 23, 2021, 3:52pm

I am currently working on getting an SSD Caffe model running in DeepStream. It has been converted to a TensorRT engine for Tegra-based platforms.

• Hardware Platform: Jetson TX2/Xavier (Currently working on TX2)
• DeepStream Version: 5.0
• JetPack Version: 4.4
• TensorRT Version: 7.1.3
• Issue Type: Error when trying to run TensorRT engine in Deepstream on TX2 platform.

Steps Completed:
Converted the Caffe SSD model into a TensorRT engine
Compiled an updated “libnvinfer_plugin.so.7.1.3” and replaced the old version with it
Compiled “libnvds_infercustomparser_tlt.so” and linked it in the config file

Current Error:

Mismatch in the number of output buffers.Expected 2 output buffers, detected in the network :1
0:00:09.304585054 25 0x559e8cb680 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:725> [UID = 1]: Failed to parse bboxes using custom parse function

SSD Output Layer (Caffe prototxt):

layer {
  name: "detection_out"
  type: "DetectionOutput"
  bottom: "mbox_loc"
  bottom: "mbox_conf_flatten"
  bottom: "mbox_priorbox"
  top: "detection_out"
  include {
    phase: TEST
  }
  detection_output_param {
    num_classes: 21
    share_location: true
    background_label_id: 0
    nms_param {
      nms_threshold: 0.45
      top_k: 100
    }
    code_type: CENTER_SIZE
    keep_top_k: 100
    confidence_threshold: 0.25
  }
}

This is all running in a customized DeepStream container on the TX2 platform. We currently have no problems running DetectNet models on the platform, and have completed similar steps to run YOLO on a dGPU setup. The last layer of my model does not have a “BatchedNMS” or “NMS” output like those referenced in the YOLO and SSD DeepStream app config files. Is there a list of available output blob names, or a way to find the appropriate one to use in this case?

mchi · August 24, 2021, 7:56am

You need to customize the post-processor for your SSD.
You can refer to the two links below for how to add your own post-processor:

https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/post_processor/nvdsinfer_custombboxparser_tlt.cpp#L246 

https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/configs/ssd_tlt/pgie_ssd_tlt_config.txt#L46 ~147
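
For readers following along, here is a rough sketch of the decoding step such a post-processor would perform. It is not taken from the linked parser. It assumes the engine exposes two outputs, as the TensorRT sample_ssd does: a detection tensor with 7 floats per row laid out as [image_id, label, confidence, xmin, ymin, xmax, ymax] with normalized coordinates, and a keep count giving the number of valid rows. The `Detection` struct and `decodeSsdDetections` name are hypothetical; a real DeepStream parser would fill `NvDsInferObjectDetectionInfo` entries instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical plain struct used only for this illustration.
struct Detection {
    int classId;
    float confidence;
    float left, top, width, height;  // in network-input pixels
};

// Decodes a Caffe-SSD-style detection tensor.
// ASSUMPTION: each row holds 7 floats
//   [image_id, label, confidence, xmin, ymin, xmax, ymax]
// with coordinates normalized to [0, 1], and keepCount is the number of
// valid rows. Verify this against your own engine's outputs.
std::vector<Detection> decodeSsdDetections(const float* detectionOut, int keepCount,
                                           int netWidth, int netHeight, float threshold)
{
    constexpr int kFieldsPerDetection = 7;
    std::vector<Detection> result;
    for (int i = 0; i < keepCount; ++i) {
        const float* det = detectionOut + static_cast<std::size_t>(i) * kFieldsPerDetection;
        const int classId = static_cast<int>(det[1]);
        const float confidence = det[2];
        // Skip the background class (label 0 in the prototxt) and weak detections.
        if (classId <= 0 || confidence < threshold) {
            continue;
        }
        // Clamp normalized corners to [0, 1], then scale to pixels.
        const float xmin = std::min(std::max(det[3], 0.0f), 1.0f) * netWidth;
        const float ymin = std::min(std::max(det[4], 0.0f), 1.0f) * netHeight;
        const float xmax = std::min(std::max(det[5], 0.0f), 1.0f) * netWidth;
        const float ymax = std::min(std::max(det[6], 0.0f), 1.0f) * netHeight;
        if (xmax <= xmin || ymax <= ymin) {
            continue;  // degenerate box
        }
        result.push_back({classId, confidence, xmin, ymin, xmax - xmin, ymax - ymin});
    }
    return result;
}
```

The "Expected 2 output buffers" message above is consistent with a parser of this kind wanting both the detection tensor and the keep count, while the engine only exposes one of them.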

zak4 · August 25, 2021, 3:53pm

Thanks for the update. I have decided to approach this a different way, and use the “sample_ssd” program to generate a new engine file with the required prototxt changes to include a second output tensor. I am however at a loss as to how to export the engine file once created. I’ve searched through the TensorRT C++ API docs and could not find a function to export or save the created engine file to disk. Would you have any insight as to how to go about doing that?

mchi · September 9, 2021, 2:47pm

Sorry for the delay! You can refer to the code below, from https://developer.nvidia.com/blog/speed-up-inference-tensorrt/, for how to call serialize() and save the host buffer to a file, and conversely, how to read the file and call deserializeCudaEngine() to get the engine back:

ICudaEngine* getCudaEngine(string const& onnxModelPath, int batchSize)
{
    string enginePath{getBasename(onnxModelPath) + "_batch" + to_string(batchSize) + ".engine"};
    ICudaEngine* engine{nullptr};

    string buffer = readBuffer(enginePath);
    if (buffer.size())
    {
        // try to deserialize engine
        unique_ptr<IRuntime, Destroy> runtime{createInferRuntime(gLogger)};
        engine = runtime->deserializeCudaEngine(buffer.data(), buffer.size(), nullptr);
    }

    if (!engine)
    {
        // Fallback to creating engine from scratch
        engine = createCudaEngine(onnxModelPath, batchSize);

        if (engine)
        {
            unique_ptr<IHostMemory, Destroy> engine_plan{engine->serialize()};
            // try to save engine for future uses
            writeBuffer(engine_plan->data(), engine_plan->size(), enginePath);
        }
    }
    return engine;
}
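
The snippet above calls readBuffer() and writeBuffer() without showing them. A minimal sketch of what they might look like, using only the C++ standard library, is below. The signatures are inferred from the call sites above and are an assumption, not the blog's actual definitions.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Writes `size` bytes from `data` to `path` in binary mode.
// Returns true if the whole buffer was written.
bool writeBuffer(const void* data, std::size_t size, const std::string& path)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    return out.good();
}

// Reads the whole file at `path` in binary mode.
// Returns an empty string if the file cannot be opened.
std::string readBuffer(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return {};
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Returning an empty string for a missing file matches the `if (buffer.size())` check above, so the caller falls back to building the engine from scratch.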

zak4 · September 9, 2021, 2:58pm

No worries!
I actually solved it last week using the sample_SSD c++ example. I added

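nvinfer1::IHostMemory *trtModelStream = mEngine->serialize();
// Binary mode keeps the serialized plan byte-for-byte on every platform
std::ofstream p("SSD_engine.engine", std::ios::binary);
p.write(static_cast<const char*>(trtModelStream->data()), trtModelStream->size());
p.close();
trtModelStream->destroy(); // release the serialized host buffer (TensorRT 7 API)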

to the build engine function. This was added towards the end before the engine was returned and saves the engine to disk before continuing on to testing.
Thanks for the update!
