Plugin layer should use the same CUDA stream received in the enqueue call

Description

The YOLO plugin layer in the tensorrtx yolov5 implementation launches its kernels on the default CUDA stream instead of the stream passed to enqueue, which breaks in-order execution of the network's kernels and produces incorrect outputs.

Environment

TensorRT Version: 7.1
GPU Type: NVIDIA T4 (AWS)
Nvidia Driver Version: 440.82
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

https://github.com/wang-xinyu/tensorrtx/blob/40f6009b18bf6abd8ce19062eb1687875bf013f1/yolov5/yololayer.cu#L215

Steps To Reproduce

I am using this repo to implement YOLOv5 in DeepStream. After facing many issues and getting incorrect outputs, I figured out that there is a bug in that repo: the kernels in the plugin layer should use the same CUDA stream that the plugin receives in the call to enqueue. This ensures in-order execution of all the kernels in the entire network.

https://github.com/wang-xinyu/tensorrtx/blob/40f6009b18bf6abd8ce19062eb1687875bf013f1/yolov5/yololayer.cu#L215
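
For context, the code at the linked line follows roughly this pattern (a simplified sketch from my reading of the repo, not the exact code; kernel arguments are abbreviated):

```cpp
// enqueue() receives a cudaStream_t from TensorRT and forwards it:
int YoloLayerPlugin::enqueue(int batchSize, const void* const* inputs,
                             void** outputs, void* workspace, cudaStream_t stream)
{
    forwardGpu((const float* const*)inputs, (float*)outputs[0], stream, batchSize);
    return 0;
}

void YoloLayerPlugin::forwardGpu(const float* const* inputs, float* output,
                                 cudaStream_t stream, int batchSize)
{
    // BUG: the launch configuration omits the stream, so CalDetection runs
    // on the default stream instead of the one TensorRT passed in, and is
    // not ordered with the rest of the network's kernels:
    CalDetection<<<(numElem + mThreadCount - 1) / mThreadCount, mThreadCount>>>(
        /* ...kernel arguments... */);
}
```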

I don't have much experience writing TensorRT C++ API plugins, so I would like to know how to make the plugin layer use the same CUDA stream it received during the call to enqueue, which will ensure in-order execution of all the kernels.

Thanks

Hi @y14uc339,
Please refer to the following documentation on adding a custom plugin:
https://docs.nvidia.com/deeplearning/tensorrt/sample-support-guide/index.html#plugin_sample
The plugin sample listed there shows a complete example.
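
To illustrate the specific change being asked about (a minimal sketch of the general technique, not the sample from the documentation above; CalDetection here is a trivial stand-in for the repo's kernel): a CUDA kernel launch accepts the stream as the fourth launch-configuration parameter, after grid size, block size, and dynamic shared memory. Passing the stream received in enqueue puts the plugin's kernels in the same in-order queue as the rest of the network:

```cpp
#include <cuda_runtime.h>

// Hypothetical stand-in for the repo's CalDetection kernel; the real
// detection-decoding logic is omitted.
__global__ void CalDetection(const float* input, float* output, int numElem)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < numElem) output[idx] = input[idx];
}

// The relevant part of the fix: thread the stream from enqueue() through to
// the kernel launch.
void forwardGpu(const float* input, float* output, int numElem,
                cudaStream_t stream)
{
    const int threads = 256;
    const int blocks = (numElem + threads - 1) / threads;
    // <<<grid, block, dynamicSharedMemBytes, stream>>> -- launching on the
    // stream TensorRT passed to enqueue() keeps this kernel ordered with
    // every other kernel the engine submits to that stream.
    CalDetection<<<blocks, threads, 0, stream>>>(input, output, numElem);
}
```

Since CUDA guarantees that work submitted to a single stream executes in submission order, routing every kernel launch in the plugin through the enqueue stream is what restores in-order execution across the whole network.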


Thanks!