Run multiple UFF models in a C++ program

Hi, I am trying to run two UFF models in a C++ program but I get an error. Below I list what I have done and tried:

  1. I have two models, A and B, and I wrap them into Class A and Class B based on the "sampleUffSSD" example code (parsing and building the TensorRT network).
  2. I have run Class A alone in a C++ program and the result is correct.
  3. I have run Class B alone in a C++ program and there is no error (the input of Class B is the output of Class A, so I am not sure whether the result of Class B is correct, but at least no error occurs).
  4. When I parse and build both Class A and Class B in one C++ program, the first one (no matter whether it is A or B) can be parsed and built, but the second one fails at model parsing. The log is below, and a sketch of the per-model parse/build flow follows it.
Begin parsing A model...
End parsing A model...
Begin building A engine...
End building A engine...
*** Deserializing ***
Begin parsing B model...
Segmentation fault (core dumped)
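
For reference, each class parses and builds its engine along these lines, a minimal sketch of the UFF-era sample flow rather than my exact code (binding names and dims are placeholders):

#include "NvInfer.h"
#include "NvUffParser.h"

// Build one engine from a UFF file; called once per model.
nvinfer1::ICudaEngine* buildEngine(nvinfer1::ILogger& logger, const char* uffFile)
{
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(logger);
    nvinfer1::INetworkDefinition* network = builder->createNetwork();
    nvuffparser::IUffParser* parser = nvuffparser::createUffParser();

    // Placeholder I/O registration; each model has its own names and dims.
    parser->registerInput("input", nvinfer1::Dims3(3, 300, 300),
                          nvuffparser::UffInputOrder::kNCHW);
    parser->registerOutput("output");
    parser->parse(uffFile, *network, nvinfer1::DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1ULL << 27); // 128 MB scratch for the builder
    nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);

    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine; // caller owns the engine
}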

Can anyone tell me how to run two TensorRT networks (from UFF models) in one C++ program?

Thanks.

Hi,

It should work the way you described.

Could you monitor the system status with tegrastats first?
Maybe you are running out of memory.

sudo tegrastats

Thanks.

@AastaLLL, thanks for the answer.

You are right, it is running out of memory.

I tried reducing memory usage by setting MaxWorkspaceSize to 100_MB for A and 10_MB for B. I am not sure whether it is the right way, but it works: now I can run both A and B in one C++ program.
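
Concretely, that amounts to something like this before building each engine (a sketch; builderA and builderB stand for each class's nvinfer1::IBuilder*):

builderA->setMaxWorkspaceSize(100ULL << 20); // 100 MB for A
builderB->setMaxWorkspaceSize(10ULL << 20);  // 10 MB for B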

Now I have another issue: when I run Model A alone, its prediction is correct, but if I run both A and B, the prediction of A is wrong. Do you have any idea what could cause this problem?

Here is more description of my situation:

  1. Both Class A and Class B use the same ILogger.
  2. A is a modified lightweight YOLO model which can detect 7 object classes.
  3. When the program runs only Class A (context->execute), all objects can be detected.
  4. When the program runs both Class A and Class B (roughly as in the sketch after this list), only 3~4 objects can be detected; the predicted bounding boxes are correct, but some objects in the image can't be detected.
  5. I monitored the system status; memory usage is about RAM 2719/3957 while running A and B.
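
Roughly, I chain the two contexts like this (a sketch with hypothetical buffer names; A's output stays on the GPU and is bound directly as B's input):

#include "NvInfer.h"

void runBoth(nvinfer1::IExecutionContext* contextA,
             nvinfer1::IExecutionContext* contextB,
             void* dImage, void* dBbox, void* dOut)
{
    void* bindingsA[] = { dImage, dBbox }; // A: image in, bbox out
    void* bindingsB[] = { dBbox,  dOut  }; // B: bbox in, final out
    contextA->execute(1, bindingsA);       // synchronous, batch size 1
    contextB->execute(1, bindingsB);
}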

Thanks for the response.

Hi,

Could you share some information or a description of your tensor data pipeline?

It should look like this:

[CPU image] -> [GPU image] -> (A) -> [GPU bbox] -> (B) -> [GPU output] -> [CPU output]
                                          |
                                          --> [CPU bbox]

Thanks.

Hi @AastaLLL, thanks for the answer.

After some debugging, I have narrowed it down to one suspicious part.

For Model A (YOLO) I wrote a custom plugin for the space-to-depth function and compiled it into a shared library (following the "uff_custom_plugin" example).

For Model B I also wrote a custom plugin, for the absolute-value function, and compiled it into a shared library.

Below is part of my program, which runs the YOLO model only. The data flow is:

[CPU image] -> [GPU image] -> (A = YoLo) -> [GPU bbox] -> [CPU bbox]

There are two cases when I run this program to detect objects in a video:

Case 1. If I comment out all abs_plugin parts (as below), the YOLO prediction is good; for example, most cars and license plates in the video can be detected.

Case 2. If I uncomment the commented parts below, the YOLO prediction is much worse. For the car class, the detection rate is almost the same as in Case 1, but the bounding-box accuracy is worse. For the license-plate class, the detection rate is much lower than in Case 1 and the bounding-box accuracy is much worse.

#include "trt_plugin/space_to_depth.h"
// #include "trt_plugin/abs_plugin.h"

#include "trt_model/trt_yolo.h"

// Static class fields initialization
nvinfer1::PluginFieldCollection SpaceToDepthCreator::mFC{};
std::vector<nvinfer1::PluginField> SpaceToDepthCreator::mPluginAttributes;

// nvinfer1::PluginFieldCollection AbsPluginCreator::mFC{};
// std::vector<nvinfer1::PluginField> AbsPluginCreator::mPluginAttributes;

I have checked abs_plugin, but maybe there is something wrong that I don't see. I post my code below for your reference, in case you could check it for me. Thanks.

abs_plugin.h

#ifndef _ABS_PLUGIN_H_
#define _ABS_PLUGIN_H_

#include "NvInferPlugin.h"
#include <string>
#include <vector>

#include "abs_kernel.h"

/********************
 ABSPlugin
********************/
class AbsPlugin: public nvinfer1::IPluginV2
{
public:
    // Ctor
    AbsPlugin();

    // Override IPluginV2 public function
    const char *getPluginType () const override;
    
    const char *getPluginVersion () const override;    
    
    int getNbOutputs () const override;
    
    nvinfer1::Dims getOutputDimensions (int index, const nvinfer1::Dims *inputs, int nbInputDims) override;
    
    void configureWithFormat(const nvinfer1::Dims* inputDims, int nbInputs, const nvinfer1::Dims* outputDims, int nbOutputs, nvinfer1::DataType type, nvinfer1::PluginFormat format, int maxBatchSize) override;
    
    bool supportsFormat (nvinfer1::DataType type, nvinfer1::PluginFormat format) const override;
    
    int initialize () override;
    
    void terminate () override;
    
    size_t getWorkspaceSize (int maxBatchSize) const override;
    
    int enqueue (int batchSize, const void *const *inputs, void **outputs, void *workspace, cudaStream_t stream) override;
    
    size_t getSerializationSize () const override;
    
    void serialize (void *buffer) const override;
    
    void destroy () override;
    
    nvinfer1::IPluginV2* clone () const override;
    
    void setPluginNamespace (const char *pluginNamespace) override;

    const char* getPluginNamespace () const override;

private:
    size_t mInputVolume;
    int mSrcDepth;
    int mSrcRow;
    int mSrcCol;
    std::string mNamespace;

};

/********************
 AbsPluginCreator
********************/
class AbsPluginCreator: public nvinfer1::IPluginCreator
{
public:
    AbsPluginCreator();

    const char* getPluginName() const override;

    const char* getPluginVersion() const override;

    const nvinfer1::PluginFieldCollection* getFieldNames() override;

    nvinfer1::IPluginV2* createPlugin(const char* name, const nvinfer1::PluginFieldCollection* fc) override;

    nvinfer1::IPluginV2* deserializePlugin(const char* name, const void* serialData, size_t serialLength) override;

    void setPluginNamespace(const char* pluginNamespace) override;

    const char* getPluginNamespace() const override;

private:
    static nvinfer1::PluginFieldCollection mFC;
    static std::vector<nvinfer1::PluginField> mPluginAttributes;
    std::string mNamespace;
};
REGISTER_TENSORRT_PLUGIN(AbsPluginCreator);

#endif // _ABS_PLUGIN_H_

abs_plugin.cpp

#include "abs_plugin.h"
#include "NvInfer.h"
#include <cuda_runtime.h>
#include <cassert>
#include <cstring>

// absolute value plugin specific constants
namespace 
{
    const char* ABS_PLUGIN_NAME{"CM_ABS_TRT"};
    const char* ABS_PLUGIN_VERSION{"1"};
} // namespace

/********************
 AbsPlugin
********************/
// Ctor
AbsPlugin::AbsPlugin()
{
}

// IPluginV2 public function
const char* AbsPlugin::getPluginType() const { return ABS_PLUGIN_NAME; }
const char* AbsPlugin::getPluginVersion() const { return ABS_PLUGIN_VERSION; }

int AbsPlugin::getNbOutputs() const { return 1; }

nvinfer1::Dims AbsPlugin::getOutputDimensions(int index, const nvinfer1::Dims *inputs, int nbInputDims)
{
    // printf("##### %s\n", __func__);
    assert(nbInputDims == 1);
    assert(index == 0);
    assert(inputs[index].nbDims == 3);

    int in_depth = inputs[0].d[0];
    int in_height = inputs[0].d[1];
    int in_width = inputs[0].d[2];

    return nvinfer1::DimsCHW(in_depth, in_height, in_width);
}

void AbsPlugin::configureWithFormat(const nvinfer1::Dims* inputDims, int nbInputs, const nvinfer1::Dims* outputDims, int nbOutputs, nvinfer1::DataType type, nvinfer1::PluginFormat format, int maxBatchSize)
{
    // printf("##### %s\n", __func__);
    assert(nbInputs == 1);
    assert(nbOutputs == 1);
    assert(type == nvinfer1::DataType::kFLOAT);
    assert(format == nvinfer1::PluginFormat::kNCHW);

    mInputVolume = inputDims[0].d[0] * inputDims[0].d[1] * inputDims[0].d[2];
}

bool AbsPlugin::supportsFormat(nvinfer1::DataType type, nvinfer1::PluginFormat format) const
{
    // printf("##### %s\n", __func__);
    if(type == nvinfer1::DataType::kFLOAT && format == nvinfer1::PluginFormat::kNCHW) {
        return true;
    }
    else {
        return false;
    }
}

int AbsPlugin::initialize() { return 0; } 

void AbsPlugin::terminate()
{
    // if something was initialized in "initialize()"
    // it must be terminated here
}

size_t AbsPlugin::getWorkspaceSize(int maxBatchSize) const { return 0; }

int AbsPlugin::enqueue(int batchSize, const void *const *inputs, void **outputs, void *workspace, cudaStream_t stream) 
{
    int status = -1;
    status = absInference(stream, inputs[0], outputs[0], mInputVolume);
    return status;
}

size_t AbsPlugin::getSerializationSize() const
{
    // printf("##### %s\n", __func__);
    return 0;
}

void AbsPlugin::serialize(void *buffer) const
{
}

void AbsPlugin::destroy()
{ 
    delete this; 
}

nvinfer1::IPluginV2* AbsPlugin::clone() const
{
    return new AbsPlugin();
}

void AbsPlugin::setPluginNamespace(const char *pluginNamespace)
{
    // printf("##### %s\n", __func__);
    mNamespace = pluginNamespace; 
}

const char* AbsPlugin::getPluginNamespace() const
{
    // printf("##### %s\n", __func__);
    return mNamespace.c_str(); 
}

//==============================================================================================================

/********************
 AbsPluginCreator
********************/
AbsPluginCreator::AbsPluginCreator()
{
}
// AbsPluginCreator::~AbsPluginCreator() {}

const char* AbsPluginCreator::getPluginName() const { return ABS_PLUGIN_NAME; }

const char* AbsPluginCreator::getPluginVersion() const { return ABS_PLUGIN_VERSION; }

const nvinfer1::PluginFieldCollection* AbsPluginCreator::getFieldNames() { return &mFC; }

nvinfer1::IPluginV2* AbsPluginCreator::createPlugin(const char* name, const nvinfer1::PluginFieldCollection* fc)
{
    return new AbsPlugin();
}

nvinfer1::IPluginV2* AbsPluginCreator::deserializePlugin(const char* name, const void* serialData, size_t serialLength)
{
    // This object will be deleted when the network is destroyed, which will
    // call AbsPlugin::destroy()
    return new AbsPlugin();
}

void AbsPluginCreator::setPluginNamespace(const char* pluginNamespace) { mNamespace = pluginNamespace; }

const char* AbsPluginCreator::getPluginNamespace() const { return mNamespace.c_str(); }

abs_kernel.h

#ifndef _ABS_KERNEL_H_
#define _ABS_KERNEL_H_
#include "NvInfer.h"

int absInference(
    cudaStream_t stream, 
    const void* src, 
    void* dst, 
    const int nElem
);

#endif // _ABS_KERNEL_H_

abs_kernel.cu

#include "abs_kernel.h"

template <typename T>
__global__ void abskernel(const T* src, T* dst, const int nElem)
{
    int d = blockIdx.x * blockDim.x + threadIdx.x;
    if(d < nElem) {
        if(src[d] < 0) {
            dst[d] = -src[d];
        }
        else {
           dst[d] = src[d];
        }
    }
}

int absInference(cudaStream_t stream, const void* src, void* dst, const int nElem)
{
    const int blockSize_k = 512;
    const int gridSize = (nElem + blockSize_k - 1) / blockSize_k;
    abskernel<float><<<gridSize, blockSize_k, 0, stream>>>(
        static_cast<const float*>(src), 
        static_cast<float*>(dst), 
        nElem
    );
    return 0;
}

I don't understand why merely including abs_plugin.h and instantiating the AbsPluginCreator static class members would affect model inference.

I think I have found the problem.

"myprogram.cpp" runs the YOLO model only, and I wrote a custom TensorRT plugin, "std_plugin", for building the YOLO network with TensorRT.

I also wrote a custom plugin, "abs_plugin", for another model.

I compiled these two custom plugins separately into two shared libraries, for instance std_plugin.so and abs_plugin.so, then linked them in CMakeLists.txt as below:

target_link_libraries(myprogram 
    ${STD_PLUGIN_LIB} 
    ${ABS_PLUGIN_LIB} 
    ${NVINFER_LIB} 
    ${NVINFER_PLUGIN} 
    ${NVPARSERS} 
    ${NVONNXPARSER} 
    ${OpenCV_LIB}
)
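
For completeness, the plugin libraries themselves are built roughly like this (target names and paths here are placeholders, not my exact CMake):

cuda_add_library(std_plugin SHARED
    ${TRT_PLUGIN_SRC}/std.cpp ${TRT_PLUGIN_CU}/std_kernel.cu)
cuda_add_library(abs_plugin SHARED
    ${TRT_PLUGIN_SRC}/abs.cpp ${TRT_PLUGIN_CU}/abs_kernel.cu)
set(STD_PLUGIN_LIB std_plugin)
set(ABS_PLUGIN_LIB abs_plugin)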

I also include their header files and instantiate their static class members in myprogram.cpp:

#include "trt_plugin/space_to_depth.h"
#include "trt_plugin/abs_plugin.h"

#include "trt_model/trt_yolo.h"

nvinfer1::PluginFieldCollection SpaceToDepthCreator::mFC{};
std::vector<nvinfer1::PluginField> SpaceToDepthCreator::mPluginAttributes;

nvinfer1::PluginFieldCollection AbsPluginCreator::mFC{};
std::vector<nvinfer1::PluginField> AbsPluginCreator::mPluginAttributes;

The setup above affects the prediction of YOLO: many objects in the image can't be detected.

Then I switched to compiling them together, and the YOLO prediction is good (the same as with the TensorFlow model):

cuda_add_executable(
    myprogram myprogram.cpp 
    ${TRT_PLUGIN_SRC}/std.cpp 
    ${TRT_PLUGIN_SRC}/abs.cpp 
    ${TRT_PLUGIN_CU}/std_kernel.cu 
    ${TRT_PLUGIN_CU}/abs_kernel.cu 
)

target_link_libraries(myprogram 
    ${NVINFER_LIB} 
    ${NVINFER_PLUGIN} 
    ${NVPARSERS} 
    ${NVONNXPARSER} 
    ${OpenCV_LIB}
)

Actually, I don't understand what is happening behind the scenes. Did I miss something in the TensorRT documentation?

Thanks.

Hi,

Did you add any parsing mechanism to it?
For example, comparing layer names to decide which plugin to use?

Thanks.

Hi @AastaLLL, thanks for answering so many of my questions.

I don't add any extra parsing mechanism.

I register each plugin in its header file, for instance:

std_plugin.h

class SpaceToDepthPlugin: public nvinfer1::IPluginV2
{
    ....
};

class SpaceToDepthCreator: public nvinfer1::IPluginCreator
{
    ....
};
REGISTER_TENSORRT_PLUGIN(SpaceToDepthCreator);

abs_plugin.h

class AbsPlugin: public nvinfer1::IPluginV2
{
    ....
};

class AbsPluginCreator: public nvinfer1::IPluginCreator
{
    ....
};
REGISTER_TENSORRT_PLUGIN(AbsPluginCreator);

It should apply the plugin by name automatically while building the TensorRT network, right?
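
(As a sanity check, something like the following should find the creator in the global registry once REGISTER_TENSORRT_PLUGIN has run; this is just a sketch:)

#include "NvInferPlugin.h"
#include <cassert>

void checkAbsPluginRegistered()
{
    // Look the creator up by the (type, version) pair the plugin reports.
    auto* creator = getPluginRegistry()->getPluginCreator("CM_ABS_TRT", "1");
    assert(creator != nullptr); // fails if static registration never ran
}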

I checked my YOLO model: there is only one space-to-depth node, and I replace it with my plugin's name at the model-conversion stage as below.

namespace_plugin_map = {
    ...
    ...
    "net/SpaceToDepth": gs.create_plugin_node(name='spaceToDepth', op="SpaceToDepth_TRT", blockSize=2)
}

And there is no node name in the YOLO model that matches my AbsPlugin.

Thanks.

Hi,

You can implement a PluginFactory to handle this issue.
There is an example in /usr/src/tensorrt/samples/samplePlugin/samplePlugin.cpp.

You can implement the mechanism like this:

// integration for serialization
class PluginFactory : public nvinfer1::IPluginFactory, public nvcaffeparser1::IPluginFactoryExt
{
public:
    ...
    // deserialization plugin implementation
    IPlugin* createPlugin(const char* layerName, const void* serialData, size_t serialLength) override
    {   
        if (!strcmp(layerName, "layerA"))
        {
            assert(mLayerA.get() == nullptr);
            mLayerA = std::unique_ptr<LayerA>(new LayerA());
            return mLayerA.get();
        }
        else if (!strcmp(layerName, "layerB"))
        {
            ...

Thanks.