How to optimize custom TensorRT plugin performance

I created a custom DeepStream plugin that runs TensorRT inference, but I am not getting performance anywhere close to nvinfer: the same detection model gives 50 fps with nvinfer but only 6 fps with my custom TensorRT plugin. Precision for both is FP32.
Since nvinfer also uses TensorRT internally for inference, I tried to find the exact inference call (the enqueueV2 or executeV2 API) in the nvinfer source code, but I cannot locate it.
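
For reference, this is roughly the asynchronous per-frame pattern I am trying to reproduce (a minimal sketch; the InferState struct, buffer names, and sizes are my own placeholders, not actual plugin code):

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Hypothetical per-plugin state, created once at startup (not per frame).
// The engine/context are assumed to be deserialized elsewhere.
struct InferState {
    nvinfer1::IExecutionContext* context; // from ICudaEngine::createExecutionContext()
    cudaStream_t stream;                  // dedicated stream, not the default stream
    void* bindings[2];                    // [0] = input, [1] = output (device memory)
    size_t inputBytes;
    size_t outputBytes;
};

// Per-frame inference: everything asynchronous on one stream, one sync at the end.
bool inferFrame(InferState& s, const void* hostInput, void* hostOutput)
{
    // H2D copy of the preprocessed frame (pinned host memory keeps this truly async)
    if (cudaMemcpyAsync(s.bindings[0], hostInput, s.inputBytes,
                        cudaMemcpyHostToDevice, s.stream) != cudaSuccess)
        return false;

    // Asynchronous inference; executeV2() would block the calling thread instead
    if (!s.context->enqueueV2(s.bindings, s.stream, nullptr))
        return false;

    // D2H copy of results on the same stream
    if (cudaMemcpyAsync(hostOutput, s.bindings[1], s.outputBytes,
                        cudaMemcpyDeviceToHost, s.stream) != cudaSuccess)
        return false;

    // Single synchronization point per frame
    return cudaStreamSynchronize(s.stream) == cudaSuccess;
}
```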

• Jetson Xavier
• DeepStream 6.0.1

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

The enqueueV2 API is invoked in the functions FullDimTrtBackendContext::enqueueBuffer() and DlaFullDimTrtBackendContext::enqueueBuffer() in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_backend.cpp
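
To compare your plugin against that path, one way to see whether the time is lost in inference itself or in the surrounding copies is to bracket just the enqueueV2() call with CUDA events. A minimal sketch; ctx, bindings, and stream stand in for whatever state your plugin already holds:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Returns the GPU time of the inference call alone, in milliseconds.
float timeEnqueue(nvinfer1::IExecutionContext* ctx, void** bindings, cudaStream_t stream)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, stream);
    ctx->enqueueV2(bindings, stream, nullptr);
    cudaEventRecord(stop, stream);

    cudaEventSynchronize(stop);          // wait for the timed region only
    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```

If this number is close to what nvinfer achieves, the 6 fps is coming from the memory copies or preprocessing around the call rather than from TensorRT itself.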
