I have built a YOLOv2 network with TensorRT 3.0 on a TX2, using several plugin layers: reorg, region, concat, and PReLU (the concat layer serves the same purpose as the route layer in Darknet), and it produces correct results. However, when I run the network in the jetson-inference detectnet-camera app, it only reaches about 4.5 fps, compared with 37.5 fps for the facenet-120 model in jetson-inference.
My only additions are setting the PluginFactory on the parser:
parser->setPluginFactory(&pluginFactory);
and deserializing the engine with the same factory:
nvinfer1::ICudaEngine* engine = infer->deserializeCudaEngine(modelMem, modelSize, &pluginFactory);
These are the main differences from the jetson-inference source code. In addition, the forward pass runs at the same speed whether I use FP16 or FP32.
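For context, here is roughly where the two plugin hooks sit in my code, in the build path and the deserialization path (a minimal sketch assuming the TensorRT 3.x API; gLogger, modelMem, modelSize, and pluginFactory are placeholders for my own objects):

```cpp
#include "NvInfer.h"
#include "NvCaffeParser.h"

// Build path: the parser needs the factory so it can create the
// reorg/region/concat/PReLU plugin layers while parsing the model.
nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
nvinfer1::INetworkDefinition* network = builder->createNetwork();
nvcaffeparser1::ICaffeParser* parser = nvcaffeparser1::createCaffeParser();
parser->setPluginFactory(&pluginFactory);

// Runtime path: the runtime needs the same factory to recreate the
// plugin layers when the serialized engine is loaded back.
nvinfer1::IRuntime* infer = nvinfer1::createInferRuntime(gLogger);
nvinfer1::ICudaEngine* engine =
    infer->deserializeCudaEngine(modelMem, modelSize, &pluginFactory);
```

Everything else (building, serializing, and running the engine) follows the jetson-inference code unchanged.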
Could someone give me some suggestions? Thank you in advance!