Can I convert yolo v2 caffemodel to tensorRT into 2 parts?

dedoogong · March 8, 2018, 5:56am

I have a YOLOv2 Caffe model (caffemodel and prototxt) plus one custom layer (a Reorg layer).
The YOLOv2 net consists of standard conv, scale, batchnorm, relu, maxpool and concat layers (I believe these are standard layers that TensorRT can convert) and one custom Reorg layer.

It would look like
[Data->CBSR1->MXPOOL1->…->CBSR20->Concat1->CBSR21] -> Reorg -> [Concat2->CBSR22->C23]

here, CBSR == Conv->BatchNorm->Scale->Relu.

So, I want to convert it into 2 parts, split at the custom layer, as below
First
[Input->CBSR1->MXPOOL1->CBSR->MXPOOL1->…->CBSR20->Concat1->CBSR21]

Second
[Concat2->CBSR22->C23]

Then, let’s assume I have 2 TRT model files, first.engine and second.engine.
How can I convert the one model into 2 separate parts and load them again?

Which Python or C++ API can I use to implement the Reorg layer plugin so that it works with those 2 TRT engines?

I expect the flow would be like below

engine_1 = trt.utils.caffe_to_trt_engine(G_LOGGER, MODEL_PROTOTXT_1, CAFFE_MODEL_1,
                                         1, 1 << 31,
                                         OUTPUT_LAYERS_1, trt.infer.DataType.FLOAT)
runtime_ft_ext = trt.infer.create_infer_runtime(G_LOGGER)
context_ft_ext = engine_1.create_execution_context()

engine_2 = trt.utils.caffe_to_trt_engine(G_LOGGER, MODEL_PROTOTXT_2, CAFFE_MODEL_2,
                                         1, 1 << 31,
                                         OUTPUT_LAYERS_2, trt.infer.DataType.FLOAT)

runtime_loc = trt.infer.create_infer_runtime(G_LOGGER)
context_loc = engine_2.create_execution_context()

bindings_1 = [int(d_input_1), int(d_output_1)]
bindings_2 = [int(d_input_2), int(d_output_2)]
context_ft_ext.enqueue(1, bindings_1, stream.handle, None)
context_loc.enqueue(1, bindings_2, stream.handle, None)

Where should I place the call to the custom Reorg plugin? If I implement it according to the existing plugin interface, does TensorRT load my custom layer and run Reorg’s enqueue automatically?

P.S. Should I split the YOLOv2 caffemodel into 2 parts and convert each one?

AastaLLL · March 8, 2018, 9:15am

Hi,

You don’t need to break your model into two parts.

As you know, TensorRT supports a plugin API.
You can build the YOLOv2 model by linking the plugin implementation to TensorRT.

Check this sample for details:

Thanks.

SeunghyunLee · March 12, 2018, 12:29am

Thank you! I replaced Reorg with another supported layer. But I still need to implement Leaky ReLU as a plugin myself; that should not be too hard, since ReLU is so simple.
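For checking a Leaky ReLU plugin's GPU output, a small CPU reference can help. The sketch below is only an illustration: the function names are hypothetical, and the default slope of 0.1 is an assumption (it is the value commonly used by YOLOv2), so it should be matched to whatever the model was trained with.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference for leaky ReLU: y = x when x > 0, otherwise slope * x.
// The 0.1 default slope is an assumption, not taken from the thread.
inline float leakyRelu(float x, float slope = 0.1f)
{
    return x > 0.0f ? x : slope * x;
}

// Applies leakyRelu to every element; useful for comparing against
// the values copied back from a GPU kernel.
std::vector<float> leakyReluReference(const std::vector<float>& input, float slope = 0.1f)
{
    std::vector<float> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
        output[i] = leakyRelu(input[i], slope);
    return output;
}
```

Because the operation is element-wise, the output has exactly the same shape as the input.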

By the way, what grid size and threads-per-block size would be best for performance?

in the face-recognition source, it uses

dim3 dimBlock(32,32);
dim3 dimGrid(dstSize[2]/dimBlock.x+1, dstSize[1]/dimBlock.y+1);
kernel_extract_roi <<< dimGrid, dimBlock, 0, stream >>> (…)
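On the grid-size arithmetic above: `n / block + 1` launches one extra block whenever `n` is an exact multiple of the block size, which is harmless only if the kernel bounds-checks its indices. A common alternative is a rounding-up division, sketched below in host-side C++ (no CUDA needed to try it). The helper names are hypothetical, and the best block size still depends on the kernel and the GPU, so this only covers the grid calculation.

```cpp
#include <cassert>

// Number of blocks needed to cover n elements with blockSize threads per block.
constexpr int ceilDiv(int n, int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical stand-in for the x/y fields of a dim3 grid.
struct Grid2D { int x; int y; };

// Grid for a width x height plane, given the thread-block extents.
inline Grid2D gridFor(int width, int height, int blockX, int blockY)
{
    return Grid2D{ceilDiv(width, blockX), ceilDiv(height, blockY)};
}
```

For a 64-wide plane and 32-wide blocks this gives 2 blocks, whereas `64 / 32 + 1` gives 3.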

Then I need to know dstSize; in the ReLU case, srcSize = dstSize.
In the face-recognition source, this size is read in the constructor as below:

BboxMergeLayer::BboxMergeLayer(const void* buffer, size_t size)
{
    assert(size == (9 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);

    dimsData = DimsCHW{d[0], d[1], d[2]};
    dimsConf = DimsCHW{d[3], d[4], d[5]};
    dimsBbox = DimsCHW{d[6], d[7], d[8]};
}

Hmm, in the ReLU case, I guess it should be as below:

LeakyReluLayer::LeakyReluLayer(const void* buffer, size_t size)
{
    assert(size == (3 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);

    dimsSrc = DimsCHW{d[0], d[1], d[2]};
}

Am I right??

Thank you!
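The deserializing constructor in the post above only works if it reads back exactly what the plugin's serialize step wrote, in the same order and with the same byte count. A minimal round-trip sketch of that idea, using a hypothetical `ChwDims` struct in place of `DimsCHW` so it compiles without TensorRT:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the three CHW extents a plugin might store.
struct ChwDims { int c; int h; int w; };

constexpr std::size_t kSerializedSize = 3 * sizeof(int);

// Writes c, h, w into buffer, which must hold at least kSerializedSize bytes.
void serializeDims(const ChwDims& dims, void* buffer)
{
    const int values[3] = {dims.c, dims.h, dims.w};
    std::memcpy(buffer, values, kSerializedSize);
}

// Reads the dims back; the size check mirrors the assert in the constructor.
ChwDims deserializeDims(const void* buffer, std::size_t size)
{
    assert(size == kSerializedSize);
    int values[3];
    std::memcpy(values, buffer, kSerializedSize);
    return ChwDims{values[0], values[1], values[2]};
}
```

If the two sides ever disagree on the count (for example 9 ints written, 3 expected), the size assert fires instead of the constructor reading garbage.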

dedoogong · March 12, 2018, 6:33am

Hmm, I replaced ReLU with Leaky ReLU, but now it reports that the Reshape layer isn’t supported either!!! omg…

AastaLLL · March 12, 2018, 8:14am

Hi,

Should be like this:

virtual nvinfer1::IPlugin* createPlugin(const char* layerName, const nvinfer1::Weights* weights, int nbWeights) override
{
...
    else if(xxx.find(std::string(layerName)) != xxx.end())
    {
        assert(mLeakyReLULayer[i] == nullptr);
        assert(nbWeights == 0 && weights == nullptr);
        mLeakyReLULayer[i] = std::unique_ptr<LeakyReLULayer>(new LeakyReLULayer());
        return mLeakyReLULayer[i].get();
    }
}
...
void destroyPlugin()
{
    for (unsigned i = 0; i < xxx.size(); ++i)
    {
        mLeakyReLULayer[i].release();
        mLeakyReLULayer[i] = nullptr;
    }
}
std::unique_ptr<LeakyReLULayer> mLeakyReLULayer[22] { nullptr, ..., nullptr };

Thanks
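The `xxx` lookup and the index `i` in the reply above are left open. One possible way to fill them in, assuming the layers are named "relu1" through "relu22" as elsewhere in this thread, is to derive the array slot from the layer name itself. The helper below is hypothetical and is not part of TensorRT; it is plain C++ so it can be tried on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Returns the zero-based slot for layers named "relu1".."relu<maxLayers>",
// or -1 for any other name. The "relu<N>" naming scheme is an assumption.
int leakyReluSlot(const char* layerName, int maxLayers = 22)
{
    const char prefix[] = "relu";
    const std::size_t prefixLen = sizeof(prefix) - 1;
    if (std::strncmp(layerName, prefix, prefixLen) != 0)
        return -1;
    const char* digits = layerName + prefixLen;
    if (*digits < '1' || *digits > '9')   // require a number with no leading zero
        return -1;
    char* end = nullptr;
    const long n = std::strtol(digits, &end, 10);
    if (*end != '\0' || n > maxLayers)    // reject trailing characters and out-of-range numbers
        return -1;
    return static_cast<int>(n) - 1;
}
```

With a name-derived index, each layer keeps the same slot no matter how many times, or in which order, the factory's create functions are called.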

dedoogong · March 19, 2018, 8:38am

thank you!

But I got a segmentation fault in tensorNet’s caffeToTRTModel method.

----------LOG----------------
…
…
…
outputs size : 23
blobNameToTensor : relu1
blobNameToTensor : relu2
blobNameToTensor : relu3
blobNameToTensor : relu4
blobNameToTensor : relu5
blobNameToTensor : relu6
blobNameToTensor : relu7
blobNameToTensor : relu8
blobNameToTensor : relu9
blobNameToTensor : relu10
blobNameToTensor : relu11
blobNameToTensor : relu12
blobNameToTensor : relu13
blobNameToTensor : relu14
blobNameToTensor : relu15
blobNameToTensor : relu16
blobNameToTensor : relu17
blobNameToTensor : relu18
blobNameToTensor : relu19
blobNameToTensor : relu20
blobNameToTensor : relu21
blobNameToTensor : relu22
blobNameToTensor : reorg1
blobNameToTensor : relu1
blobNameToTensor : relu2
Segmentation fault (core dumped)

----------tensorNet.cpp CODE----------------
…
…
…

printf("outputs size : %zu\n", outputs.size());
for (auto& s : outputs) {
    printf("blobNameToTensor : %s\n", s.c_str());
}

for (auto& s : outputs) {
    printf("blobNameToTensor : %s\n", s.c_str());
    network->markOutput(*blobNameToTensor->find(s.c_str()));  // <-- this line
}
…
…
…

There are 22 relu layers… can you guess what I missed?
The error seems to occur when marking the output of the relu3 plugin layer…
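One thing worth ruling out here, offered as a guess rather than a diagnosis: the loop dereferences the result of `find` unconditionally. As far as I know, the Caffe parser's name lookup returns a null pointer when no blob has the requested name, and it looks up blob (top) names, which need not match layer names. The sketch below uses hypothetical stand-in types (`Tensor`, `BlobTable`) to show the guard pattern without depending on TensorRT.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: Tensor plays the role of a network tensor and
// BlobTable the role of the parser's name-to-tensor lookup.
struct Tensor { std::string name; };

class BlobTable
{
public:
    void add(const std::string& name) { mTensors[name] = Tensor{name}; }

    // Returns nullptr when no blob has this name.
    Tensor* find(const std::string& name)
    {
        auto it = mTensors.find(name);
        return it == mTensors.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, Tensor> mTensors;
};

// Collects the requested names that have no matching blob, instead of
// dereferencing the lookup result unconditionally.
std::vector<std::string> missingBlobs(BlobTable& table, const std::vector<std::string>& outputs)
{
    std::vector<std::string> missing;
    for (const auto& name : outputs)
        if (table.find(name) == nullptr)
            missing.push_back(name);
    return missing;
}
```

Printing the missing names before calling markOutput would show whether any requested output is absent from the parsed network.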

I set the output dimension of leaky relu as
DimsCHW(inputs[0].d[0], inputs[0].d[1], inputs[0].d[2]);

and I set the output dim of reorg as
DimsCHW(256, 19, 19); // this is right for 608x608 input.
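Rather than hard-coding the Reorg output for one input resolution, the shape can be derived from the input shape. The sketch below assumes a Reorg stride of 2 and a 64 x 38 x 38 input to the layer at 608x608 resolution (both are assumptions based on the usual YOLOv2 configuration); `ChwShape` is a hypothetical stand-in for `DimsCHW`.

```cpp
#include <cassert>

// Hypothetical stand-in for a channels/height/width shape.
struct ChwShape { int c; int h; int w; };

// Output shape of a reorg layer with the given stride:
// channels grow by stride * stride, height and width shrink by stride.
ChwShape reorgOutputShape(const ChwShape& in, int stride = 2)
{
    assert(stride > 0 && in.h % stride == 0 && in.w % stride == 0);
    return ChwShape{in.c * stride * stride, in.h / stride, in.w / stride};
}
```

Under those assumptions, a 64 x 38 x 38 input yields 256 x 19 x 19, which agrees with the hard-coded value above.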

hum…I think I should show you all of my code as below;;

------------------------------------------pluginImplement.h-------------------------------------------

using namespace nvinfer1;
using namespace nvcaffeparser1;
using namespace plugin;

class LeakyReLULayer : public IPlugin
{
public:
    LeakyReLULayer() {};
    LeakyReLULayer(const void* buffer, size_t size);

    inline int getNbOutputs() const override { return 1; };
    Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override;

    int initialize() override;
    inline void terminate() override { ; }

    inline size_t getWorkspaceSize(int) const override { return 0; }
    int enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream) override;

    size_t getSerializationSize() override;
    void serialize(void* buffer) override;

    void configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs, int) override;

protected:
    DimsCHW dimsSrc;
};

class ReorgLayer : public IPlugin
{
public:
    ReorgLayer() {};
    ReorgLayer(const void* buffer, size_t size);

    inline int getNbOutputs() const override { return 1; };
    Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override;

    int initialize() override;
    inline void terminate() override { ; }

    inline size_t getWorkspaceSize(int) const override { return 0; }
    int enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream) override;

    size_t getSerializationSize() override;
    void serialize(void* buffer) override;

    void configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs, int) override;

protected:
    DimsCHW dimsSrc;
};

class PluginFactory : public nvinfer1::IPluginFactory, public nvcaffeparser1::IPluginFactory
{
public:
    virtual nvinfer1::IPlugin* createPlugin(const char* layerName, const nvinfer1::Weights* weights, int nbWeights) override;
    IPlugin* createPlugin(const char* layerName, const void* serialData, size_t serialLength) override;

    bool isPlugin(const char* name) override;
    void destroyPlugin();
    std::unique_ptr<LeakyReLULayer> mLeakyReLULayer[22] {nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr};
    std::unique_ptr<ReorgLayer> mReorgLayer{ nullptr };
};

#endif

------------------------------------------pluginImplement.c-------------------------------------------
#include <pluginImplement.h>

//
// PluginFactory
//

//typedef std::unique_ptr<LeakyReLULayer> LeakyReLULayerPtr;
//LeakyReLULayerPtr mLeakyReLULayer[22] {nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr, nullptr, nullptr, nullptr,nullptr, nullptr};
int mReluCount = 0;
// allocate elements with this:

nvinfer1::IPlugin* PluginFactory::createPlugin(const char* layerName, const nvinfer1::Weights* weights, int nbWeights)
{
printf("NV IPlugin layerName : %s \n",layerName);
assert(isPlugin(layerName));

if( !strcmp(layerName, "relu1")
 || !strcmp(layerName, "relu2")
 || !strcmp(layerName, "relu3")
 || !strcmp(layerName, "relu4")
 || !strcmp(layerName, "relu5")
 || !strcmp(layerName, "relu6")
 || !strcmp(layerName, "relu7")
 || !strcmp(layerName, "relu8")
 || !strcmp(layerName, "relu9")
 || !strcmp(layerName, "relu10")
 || !strcmp(layerName, "relu11")
 || !strcmp(layerName, "relu12")
 || !strcmp(layerName, "relu13")
 || !strcmp(layerName, "relu14")
 || !strcmp(layerName, "relu15")
 || !strcmp(layerName, "relu16")
 || !strcmp(layerName, "relu17")
 || !strcmp(layerName, "relu18")
 || !strcmp(layerName, "relu19")
 || !strcmp(layerName, "relu20")
 || !strcmp(layerName, "relu21")
 || !strcmp(layerName, "relu22")){
    printf("NV IPlugin mReluCount : %d \n",mReluCount);
    //assert(mReluCount <22);
    assert(mLeakyReLULayer[mReluCount].get() == nullptr);
    mLeakyReLULayer[mReluCount] =  std::unique_ptr<LeakyReLULayer>(new LeakyReLULayer());
    //mLeakyReLULayer[mReluCount] = LeakyReLULayerPtr();
    return mLeakyReLULayer[mReluCount++].get();
}
else if(!strcmp(layerName, "reorg1")){
    printf("NV IPlugin reorg1 \n");
    assert(mReorgLayer.get() == nullptr);
    mReorgLayer= std::unique_ptr<ReorgLayer>(new ReorgLayer());
    return mReorgLayer.get();
}
else
{
    printf("NV NOT VALID PLUG IN LAYER \n");
    assert(0);
    return nullptr;
}

}

IPlugin* PluginFactory::createPlugin(const char* layerName, const void* serialData, size_t serialLength)
{
printf("IPlugin layerName : %s \n",layerName);
assert(isPlugin(layerName));

if( !strcmp(layerName, "relu1")
 || !strcmp(layerName, "relu2")
 || !strcmp(layerName, "relu3")
 || !strcmp(layerName, "relu4")
 || !strcmp(layerName, "relu5")
 || !strcmp(layerName, "relu6")
 || !strcmp(layerName, "relu7")
 || !strcmp(layerName, "relu8")
 || !strcmp(layerName, "relu9")
 || !strcmp(layerName, "relu10")
 || !strcmp(layerName, "relu11")
 || !strcmp(layerName, "relu12")
 || !strcmp(layerName, "relu13")
 || !strcmp(layerName, "relu14")
 || !strcmp(layerName, "relu15")
 || !strcmp(layerName, "relu16")
 || !strcmp(layerName, "relu17")
 || !strcmp(layerName, "relu18")
 || !strcmp(layerName, "relu19")
 || !strcmp(layerName, "relu20")
 || !strcmp(layerName, "relu21")
 || !strcmp(layerName, "relu22")){
    printf("IPlugin mReluCount : %d \n",mReluCount);
    //assert(mReluCount <22);
    assert(mLeakyReLULayer[mReluCount].get() == nullptr);
    //mLeakyReLULayer[mReluCount] = LeakyReLULayerPtr(serialData, serialLength);
    mLeakyReLULayer[mReluCount] =  std::unique_ptr<LeakyReLULayer>(new LeakyReLULayer(serialData, serialLength));
    return mLeakyReLULayer[mReluCount++].get();
}
else if(!strcmp(layerName, "reorg1")){
    printf("IPlugin reorg1 \n");
    assert(mReorgLayer.get() == nullptr);
    mReorgLayer= std::unique_ptr<ReorgLayer>(new ReorgLayer(serialData, serialLength));
    return mReorgLayer.get();
}
else
{   printf("NOT VALID PLUG IN LAYER \n");
    assert(0);
    return nullptr;
}

}

bool PluginFactory::isPlugin(const char* name)
{
    if (!strcmp(name, "relu1")
     || !strcmp(name, "relu2")
     || !strcmp(name, "relu3")
     || !strcmp(name, "relu4")
     || !strcmp(name, "relu5")
     || !strcmp(name, "relu6")
     || !strcmp(name, "relu7")
     || !strcmp(name, "relu8")
     || !strcmp(name, "relu9")
     || !strcmp(name, "relu10")
     || !strcmp(name, "relu11")
     || !strcmp(name, "relu12")
     || !strcmp(name, "relu13")
     || !strcmp(name, "relu14")
     || !strcmp(name, "relu15")
     || !strcmp(name, "relu16")
     || !strcmp(name, "relu17")
     || !strcmp(name, "relu18")
     || !strcmp(name, "relu19")
     || !strcmp(name, "relu20")
     || !strcmp(name, "relu21")
     || !strcmp(name, "relu22")
     || !strcmp(name, "reorg1")){
        //printf(" Layer Name : %s \n",name);
        return true;
    }
    else{
        return false;
    }
}

void PluginFactory::destroyPlugin()
{
    for(int i=0; i<22; i++){
        mLeakyReLULayer[i].release();
        mLeakyReLULayer[i] = nullptr;
    }
    mReorgLayer.release();
    mReorgLayer = nullptr;
}

//
// LeakyReLULayer Plugin Layer
//
//void convertROI(float* input, float* output, char* mean, const int* srcSize, const int* dstSize, const int* roi, cudaStream_t stream);
void leakyrelu_gpu(float* input, int * srcSize, float* output, cudaStream_t stream);

LeakyReLULayer::LeakyReLULayer(const void* buffer, size_t size)
{
    printf(" LeakyReLULayer Constructor \n");
    assert(size == (3 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);
    dimsSrc = DimsCHW{d[0], d[1], d[2]};
}

Dims LeakyReLULayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
{
    printf(" LeakyReLULayer getOutputDimensions → nbInputDims: %d\n",nbInputDims);
    assert(nbInputDims==1);
    return DimsCHW(inputs[0].d[0], inputs[0].d[1], inputs[0].d[2]);
}

int LeakyReLULayer::initialize()
{
    printf(" LeakyReLULayer initialize \n");
    return 0;
}

int LeakyReLULayer::enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream)
{
    printf(" LeakyReLULayer enqueue \n");
    int srcSize[] {dimsSrc.c(), dimsSrc.h(), dimsSrc.w()};
    leakyrelu_gpu((float*)inputs[0], srcSize, (float*)outputs[0],stream);
    return 0;
}

size_t LeakyReLULayer::getSerializationSize()
{
    printf(" LeakyReLULayer getSerializationSize \n");
    return 3*sizeof(int);
}

void LeakyReLULayer::serialize(void* buffer)
{
    printf(" LeakyReLULayer serialize \n");
    int* d = reinterpret_cast<int*>(buffer);
    d[0] = dimsSrc.c(); d[1] = dimsSrc.h(); d[2] = dimsSrc.w();
}

void LeakyReLULayer::configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs, int)
{
    printf(" LeakyReLULayer configure \n");
    dimsSrc = DimsCHW{inputs[0].d[0], inputs[0].d[1], inputs[0].d[2]};
}

//////////////////////////////////////////////////////////////////////////////////
//
// ReorgLayer Plugin Layer
//
//void convertROI(float* input, float* output, char* mean, const int* srcSize, const int* dstSize, const int* roi, cudaStream_t stream);
void reorg_gpu(float* input, int * srcSize, int* dstSize, float* output, cudaStream_t stream);

ReorgLayer::ReorgLayer(const void* buffer, size_t size)
{
    printf(" ReorgLayer Constructor\n");
    assert(size == (3 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);

    dimsSrc = DimsCHW{d[0], d[1], d[2]};
    //dimsConf = DimsCHW{d[3], d[4], d[5]};
    //dimsBbox = DimsCHW{d[6], d[7], d[8]};
}

Dims ReorgLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
{
    printf(" ReorgLayer getOutputDimensions → nbInputDims: %d\n",nbInputDims);
    assert(nbInputDims==1);
    return DimsCHW(256, 19, 19);
}

int ReorgLayer::initialize()
{
    printf(" ReorgLayer initialize \n");
    return 0;
}

int ReorgLayer::enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream)
{
    printf(" ReorgLayer enqueue \n");
    CHECK(cudaThreadSynchronize());

    int srcSize[] {dimsSrc.w(), dimsSrc.h(), dimsSrc.c()};
    int dstSize[] {19, 19, 256};
    reorg_gpu((float*)inputs[0], srcSize, dstSize, (float*)outputs[0],stream);
    return 0;
}

size_t ReorgLayer::getSerializationSize()
{
    printf(" ReorgLayer getSerializationSize \n");
    return 3*sizeof(int);
}

void ReorgLayer::serialize(void* buffer)
{
    printf(" ReorgLayer serialize \n");
    int* d = reinterpret_cast<int*>(buffer);
    d[0] = dimsSrc.c(); d[1] = dimsSrc.h(); d[2] = dimsSrc.w();
}

void ReorgLayer::configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs, int)
{
    printf(" ReorgLayer configure \n");
    dimsSrc = DimsCHW{inputs[0].d[0], inputs[0].d[1], inputs[0].d[2]};
}

------------------------------------------tensorNet.cpp-------------------------------------------

void TensorNet::caffeToTRTModel(const std::string& deployFile,
                                const std::string& modelFile,
                                const std::vector<std::string>& outputs,
                                unsigned int maxBatchSize)
{
IBuilder* builder = createInferBuilder(gLogger);
INetworkDefinition* network = builder->createNetwork();

ICaffeParser* parser = createCaffeParser();
parser->setPluginFactory(&pluginFactory);
bool useFp16 = builder->platformHasFastFp16();
DataType modelDataType = useFp16 ? DataType::kHALF : DataType::kFLOAT;

const IBlobNameToTensor *blobNameToTensor =	parser->parse(deployFile.c_str(),
                                                          modelFile.c_str(),
                                                          *network,
                                                          modelDataType);
assert(blobNameToTensor != nullptr);
for (auto& s : outputs) {

    printf("blobNameToTensor : %s\n",s.c_str());
    network->markOutput(*blobNameToTensor->find(s.c_str()));

}

builder->setMaxBatchSize(maxBatchSize);
builder->setMaxWorkspaceSize(16 << 20);

if(useFp16) builder->setHalf2Mode(true);

ICudaEngine* engine = builder->buildCudaEngine(*network);
assert(engine);
network->destroy();
parser->destroy();

gieModelStream = engine->serialize();
engine->destroy();
builder->destroy();
pluginFactory.destroyPlugin();
shutdownProtobufLibrary();

}
…
…
…

and the seg fault result is as below;

…
…
…

v4l2src device=/dev/video0 ! video/x-raw, width=(int)1280, height=(int)720, format=RGB ! videoconvert ! video/x-raw, format=RGB ! videoconvert !appsink name=mysink
successfully initialized video device
width: 1280
height: 720
depth: 24 (bpp)

loadLabelInfo
successfully load LabelInfo
NV IPlugin layerName : relu1
NV IPlugin mReluCount : 0
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu2
NV IPlugin mReluCount : 1
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu3
NV IPlugin mReluCount : 2
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu4
NV IPlugin mReluCount : 3
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu5
NV IPlugin mReluCount : 4
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu6
NV IPlugin mReluCount : 5
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu7
NV IPlugin mReluCount : 6
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu8
NV IPlugin mReluCount : 7
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu9
NV IPlugin mReluCount : 8
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu10
NV IPlugin mReluCount : 9
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu11
NV IPlugin mReluCount : 10
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu12
NV IPlugin mReluCount : 11
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu13
NV IPlugin mReluCount : 12
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu14
NV IPlugin mReluCount : 13
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu15
NV IPlugin mReluCount : 14
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu16
NV IPlugin mReluCount : 15
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu17
NV IPlugin mReluCount : 16
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu18
NV IPlugin mReluCount : 17
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu19
NV IPlugin mReluCount : 18
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu20
NV IPlugin mReluCount : 19
LeakyReLULayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu21
NV IPlugin mReluCount : 20
NV IPlugin layerName : reorg1
NV IPlugin reorg1
LeakyReLULayer getOutputDimensions → nbInputDims: 1
ReorgLayer getOutputDimensions → nbInputDims: 1
NV IPlugin layerName : relu22
NV IPlugin mReluCount : 21
LeakyReLULayer getOutputDimensions → nbInputDims: 1
outputs size : 23
blobNameToTensor : relu1
blobNameToTensor : relu2
blobNameToTensor : relu3
blobNameToTensor : relu4
blobNameToTensor : relu5
blobNameToTensor : relu6
blobNameToTensor : relu7
blobNameToTensor : relu8
blobNameToTensor : relu9
blobNameToTensor : relu10
blobNameToTensor : relu11
blobNameToTensor : relu12
blobNameToTensor : relu13
blobNameToTensor : relu14
blobNameToTensor : relu15
blobNameToTensor : relu16
blobNameToTensor : relu17
blobNameToTensor : relu18
blobNameToTensor : relu19
blobNameToTensor : relu20
blobNameToTensor : relu21
blobNameToTensor : relu22
blobNameToTensor : reorg1
blobNameToTensor : relu1
blobNameToTensor : relu2
Segmentation fault (core dumped)

I tried to debug it myself, but I couldn’t find the cause of this error or where it occurs.

Please help me one more time;;;

Thank you!!

AastaLLL · March 22, 2018, 6:53am

Hi,

Could you verify the plugin implementation with a single Leaky ReLU layer first?
This will help narrow down whether the segmentation fault comes from the plugin implementation or from the handling of multiple pointers.

Thanks.
