I use R-FCN for object detection, and I want to implement R-FCN on TensorRT to accelerate inference.
As far as I know, TensorRT provides a Faster R-CNN sample. In that sample, Proposal and RoI Pooling are combined into one plugin. I would like to ask how to replace RoI Pooling with PSRoI Pooling in the current Faster R-CNN plugin.
Thanks.
Hi,
PSRoI Pooling is not in our support matrix currently.
Since we don’t release the source of the RoI Pooling plugin, you will need to implement it yourself.
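In case it helps, here is a minimal sketch of what such a plugin class could look like against the IPlugin interface used by the samples of that generation. The class name PSRoIPoolingLayer, its members (mNumRois, mOutputDim, mPooledSize), and the two-input layout are assumptions for illustration only; the pooling kernel itself is omitted.

#include <cassert>
#include <cuda_runtime.h>
#include "NvInfer.h"

using namespace nvinfer1;

// Hypothetical PSRoI Pooling plugin skeleton; names and layout are illustrative only.
class PSRoIPoolingLayer : public IPlugin
{
public:
    PSRoIPoolingLayer(int numRois, int outputDim, int pooledSize)
        : mNumRois(numRois), mOutputDim(outputDim), mPooledSize(pooledSize) {}

    int getNbOutputs() const override { return 1; }

    Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override
    {
        // Assumed inputs: [0] position-sensitive score maps (CHW), [1] rois.
        assert(index == 0 && nbInputDims == 2);
        // One possible packing of the (numRois, outputDim, pooledH, pooledW) result into CHW.
        return DimsCHW(mNumRois * mOutputDim, mPooledSize, mPooledSize);
    }

    void configure(const Dims* inputDims, int nbInputs, const Dims* outputDims,
                   int nbOutputs, int maxBatchSize) override {}

    int initialize() override { return 0; }            // no weights to load
    void terminate() override {}                       // nothing to release
    size_t getWorkspaceSize(int maxBatchSize) const override { return 0; }

    int enqueue(int batchSize, const void* const* inputs, void** outputs,
                void* workspace, cudaStream_t stream) override
    {
        // Launch your PSRoI pooling CUDA kernel here.
        return 0;
    }

    size_t getSerializationSize() override { return 3 * sizeof(int); }

    void serialize(void* buffer) override
    {
        // Persist whatever is needed to rebuild the layer when deserializing the engine.
        int* d = reinterpret_cast<int*>(buffer);
        d[0] = mNumRois; d[1] = mOutputDim; d[2] = mPooledSize;
    }

private:
    int mNumRois, mOutputDim, mPooledSize;
};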
Thanks.
Hi,
When implementing my plugin, I have the following questions:
- If my custom layer has no weights, how do I initialize my plugin?
- Does TensorRT support DenseNet?
Thanks.
Hi,
1. You can still initialize a plugin without weights. For example:
int PSRoiLayer::initialize()
{
    // No weights or other resources to set up; returning 0 indicates success.
    return 0;
}
2. You can check whether all the layers of your model are supported by TensorRT.
Here is our support matrix for your reference:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html
Thanks.
Thank you for your reply.
I have another question during implementing my own plugin.
I notice that the function caffeToTRTModel() receives an IPluginFactory* as a parameter if I have an external plugin. One IPluginFactory creates one external plugin. However, I have more than one plugin, so how should I pass an IPluginFactory* to caffeToTRTModel() to integrate several plugins?
Thank you again.
Hi,
You can compare the layer name inside a single IPluginFactory and create the corresponding plugin accordingly.
For example:
nvinfer1::IPlugin* PluginFactory::createPlugin(const char* layerName, const nvinfer1::Weights* weights, int nbWeights)
{
    assert(isPlugin(layerName));

    if (!strcmp(layerName, "bboxMerge"))
    {
        assert(mBboxMergeLayer.get() == nullptr);
        mBboxMergeLayer = std::unique_ptr<BboxMergeLayer>(new BboxMergeLayer());
        return mBboxMergeLayer.get();
    }
    else if (!strcmp(layerName, "dataRoi"))
    {
        assert(mDataRoiLayer.get() == nullptr);
        mDataRoiLayer = std::unique_ptr<DataRoiLayer>(new DataRoiLayer());
        return mDataRoiLayer.get();
    }
    else if (!strcmp(layerName, "selectBbox"))
    {
        assert(mSelectLayer.get() == nullptr);
        mSelectLayer = std::unique_ptr<RecognitionLayer>(new RecognitionLayer(FunctionType::SELECT));
        return mSelectLayer.get();
    }
    else if (!strcmp(layerName, "summaryLabel"))
    {
        assert(mSummaryLayer.get() == nullptr);
        mSummaryLayer = std::unique_ptr<RecognitionLayer>(new RecognitionLayer(FunctionType::SUMMARY));
        return mSummaryLayer.get();
    }
    else
    {
        assert(0);
        return nullptr;
    }
}
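For reference, the matching isPlugin() check is just a name comparison against the plugin layers the factory knows about; a minimal sketch using the layer names from the snippet above could look like this:

bool PluginFactory::isPlugin(const char* name)
{
    // Return true for every layer name that this factory knows how to create.
    return !strcmp(name, "bboxMerge")
        || !strcmp(name, "dataRoi")
        || !strcmp(name, "selectBbox")
        || !strcmp(name, "summaryLabel");
}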
Here is a sample for TensorRT 2.1 for your reference.
Some APIs may have been updated, but the overall mechanism is the same:
https://github.com/AastaNV/Face-Recognition/blob/master/pluginImplement.cpp
Thanks.
Hi,
Thank you for your reply. I have another question.
If I want to know the input shape of my own layer (plugin), how should I do that? In the Face-Recognition example you provided, it seems that you get the input shape in the following way:
BboxMergeLayer::BboxMergeLayer(const void* buffer, size_t size)
{
    assert(size == (9 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);
    dimsData = DimsCHW{d[0], d[1], d[2]};
    dimsConf = DimsCHW{d[3], d[4], d[5]};
    dimsBbox = DimsCHW{d[6], d[7], d[8]};
}
Does ‘const void* buffer’ come from the binary weight file (e.g. the caffemodel)? Does that mean my caffemodel file must contain shape information? Can you provide the method to get the input shape of my own layer?
Thank you.
Hi,
The input dimension information is available in this function:
Dims BboxMergeLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
{
    assert(nbInputDims == 3);
    // inputs[0] -> the dimensions of the first input tensor
    ...
    return DimsCHW(1, 1, 1);
}
Thanks
Hi,
Thank you for your reply.
If layer B is the input of layer A, then I can use B::getOutputDimensions(int index, const Dims* inputs, int nbInputDims) to obtain the input shape of layer A. Do I understand correctly?
Thank you.
Hi,
For layer A, in this function:
Dims BboxMergeLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
nbInputDims tells you the number of input tensors.
inputs gives the dimensions of each input, e.g. inputs[0] is the Dims of the first input.
And you will need to return the dimensions of the layer output.
For example, this return value means the output tensor of this layer is 1x1x1:
return DimsCHW(1, 1, 1);
Please note that the dimensions here are CHW only, i.e. they exclude the batch axis; you don’t need to take care of the batch size here.
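As a concrete illustration, continuing the hypothetical PSRoIPoolingLayer sketch from earlier in this thread (the member names mNumRois, mOutputDim, and mPooledSize are still assumptions), the input dimensions can be read directly from the inputs array:

Dims PSRoIPoolingLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims)
{
    assert(index == 0 && nbInputDims == 2);

    // inputs[0] is the Dims of the first bottom (here assumed to be the score maps),
    // so inputs[0].d[0], inputs[0].d[1], inputs[0].d[2] are its C, H and W.
    const Dims& scoreMaps = inputs[0];
    assert(scoreMaps.d[0] == mOutputDim * mPooledSize * mPooledSize);

    // One possible packing of the (numRois, outputDim, pooledH, pooledW) result into CHW.
    return DimsCHW(mNumRois * mOutputDim, mPooledSize, mPooledSize);
}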
Thanks.
Hi,
Thank you.
Dims BboxMergeLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims) gives the output dimensions of BboxMergeLayer. However, I want to know the input dimensions of BboxMergeLayer inside the BboxMergeLayer::enqueue function:
int BboxMergeLayer::enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream)
How can I get “const Dims* inputs” from “const void* const* inputs”?
Thank you again.
Hi,
The getOutputDimensions function is only called when creating the TensorRT engine.
Please store the values in member variables if you need the dimension information later.
We don’t pass the dimension information at inference time.
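For instance, with the hypothetical PSRoIPoolingLayer from earlier (mDimsScore and mDimsRois are assumed member variables), the dimensions could be recorded in configure(), which is also called at build time, and then used in enqueue():

void PSRoIPoolingLayer::configure(const Dims* inputDims, int nbInputs,
                                  const Dims* outputDims, int nbOutputs, int maxBatchSize)
{
    // Called while the engine is being built: record the per-sample input shapes.
    assert(nbInputs == 2);
    mDimsScore = DimsCHW{inputDims[0].d[0], inputDims[0].d[1], inputDims[0].d[2]};
    mDimsRois  = DimsCHW{inputDims[1].d[0], inputDims[1].d[1], inputDims[1].d[2]};
}

int PSRoIPoolingLayer::enqueue(int batchSize, const void* const* inputs, void** outputs,
                               void* workspace, cudaStream_t stream)
{
    // At inference time the saved members provide the input sizes; no Dims are passed here.
    const int scoreVolume = mDimsScore.c() * mDimsScore.h() * mDimsScore.w();
    // ... launch the PSRoI pooling kernel using scoreVolume, mDimsScore, mDimsRois ...
    (void)scoreVolume;
    return 0;
}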
Thanks.
Hi,
Thank you.
BboxMergeLayer::BboxMergeLayer(const void* buffer, size_t size)
{
    assert(size == (9 * sizeof(int)));
    const int* d = reinterpret_cast<const int*>(buffer);
    dimsData = DimsCHW{d[0], d[1], d[2]};
    dimsConf = DimsCHW{d[3], d[4], d[5]};
    dimsBbox = DimsCHW{d[6], d[7], d[8]};
}
Does this function obtain the dimension information during inference?
void BboxMergeLayer::serialize(void* buffer)
{
    int* d = reinterpret_cast<int*>(buffer);
    d[0] = dimsData.c(); d[1] = dimsData.h(); d[2] = dimsData.w();
    d[3] = dimsConf.c(); d[4] = dimsConf.h(); d[5] = dimsConf.w();
    d[6] = dimsBbox.c(); d[7] = dimsBbox.h(); d[8] = dimsBbox.w();
}
Does this function restore the dimension information during inference?
Thank you again.
Hi,
We provide a serialize API to save the TensorRT engine directly, without running the model parser each time.
To do so, you will need to save/restore all the required parameters in the plugin layer.
The first function you quoted (the constructor) is called when deserializing, while the other (serialize) saves the parameters when serializing.
The dimension information is only passed when parsing a Caffe/TF/ONNX model.
You will need to save the information in a variable and serialize it into the file.
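Continuing the same hypothetical PSRoIPoolingLayer (mDimsScore and mDimsRois are assumed member variables), the save/restore pair could look like the sketch below, mirroring the BboxMergeLayer functions quoted above:

size_t PSRoIPoolingLayer::getSerializationSize()
{
    return 6 * sizeof(int);   // two CHW triplets
}

void PSRoIPoolingLayer::serialize(void* buffer)
{
    // Called when serializing the engine: write the recorded dimensions out.
    int* d = reinterpret_cast<int*>(buffer);
    d[0] = mDimsScore.c(); d[1] = mDimsScore.h(); d[2] = mDimsScore.w();
    d[3] = mDimsRois.c();  d[4] = mDimsRois.h();  d[5] = mDimsRois.w();
}

PSRoIPoolingLayer::PSRoIPoolingLayer(const void* buffer, size_t size)
{
    // Called when deserializing the engine: restore the members from the buffer.
    assert(size == 6 * sizeof(int));
    const int* d = reinterpret_cast<const int*>(buffer);
    mDimsScore = DimsCHW{d[0], d[1], d[2]};
    mDimsRois  = DimsCHW{d[3], d[4], d[5]};
}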
Thanks.
Hi,
Thank you.
Does that mean TensorRT does not support dynamic input image shapes? I should fix the shape of the input image during inference. Do I understand correctly?
Thank you again.
Hi,
You are right.
TensorRT doesn’t support dynamic input shapes.
This is because we allocate all the memory when creating the engine.
A dynamic mechanism tends to degrade inference performance.
Thanks.
Hi,
I have a new question, about the mAP of the detection model on TensorRT.
For the original Faster R-CNN, mAP@0.5 drops by about 13% (e.g. 95.10% vs 82.30%) after deploying it on TensorRT. I checked the ‘rois’ of both models (Caffe and TensorRT) and found that they are completely different.
For the R-FCN we implemented, mAP@0.5 drops even more.
Do you have any suggestions?
Thank you.
Hi,
Which precision do you use?
If you are using FP16 or INT8, could you first try FP32 to see if it helps?
Thanks.
Hi,
The strange thing is that I am already using FP32 precision.
However, our test images have varying sizes, so before I send them to the TensorRT model I resize them to a fixed size of 1000x300. Could this operation influence the final mAP?
Thank you.