TensorRT Version: 7.1
GPU Type: T4 (AWS)
Nvidia Driver Version: 440.82
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Steps To Reproduce
I am using this repo to implement YOLOv5 in DeepStream. After running into many issues and getting incorrect outputs, I figured out that there is a bug in that repo: the kernels in the plugin layer should use the same incoming CUDA stream that the plugin receives in the call to “enqueue”, but they currently do not. Using that stream ensures in-order execution of all the kernels in the entire network.
I don't have much experience writing plugins with the TensorRT C++ API, so I want to know how to make the plugin layer use the same CUDA stream it received in the call to “enqueue”, so that all the kernels execute in order.
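
For reference, here is a minimal sketch of what I understand the fix should look like. It is not the code from the repo. `scaleKernel`, `kElemsPerSample`, `enqueueSketch` and `runEnqueueDemo` are hypothetical names I made up for illustration, and `enqueueSketch` is a free function that only mirrors the parameter list of `IPluginV2::enqueue` in TensorRT 7.x so that it compiles without the TensorRT headers. The idea is that every kernel launch passes `stream` as the fourth launch parameter, every memset or copy uses the `Async` variant with that same stream, and nothing inside enqueue synchronizes the device.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel standing in for the plugin's real work: out[i] = in[i] * factor.
__global__ void scaleKernel(const float* in, float* out, int n, float factor)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
    {
        out[idx] = in[idx] * factor;
    }
}

// Assumed number of floats per batch item (illustrative only).
constexpr int kElemsPerSample = 1024;

// Same parameter list as IPluginV2::enqueue in TensorRT 7.x, written as a free
// function so the sketch does not depend on the TensorRT headers.
int enqueueSketch(int batchSize, const void* const* inputs, void** outputs,
                  void* /*workspace*/, cudaStream_t stream)
{
    const float* in = static_cast<const float*>(inputs[0]);
    float* out = static_cast<float*>(outputs[0]);
    const int n = batchSize * kElemsPerSample;

    // Use cudaMemsetAsync on the incoming stream instead of plain cudaMemset.
    if (cudaMemsetAsync(out, 0, n * sizeof(float), stream) != cudaSuccess)
    {
        return 1;
    }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    // The 4th launch parameter is the stream. Leaving it out launches on the
    // default stream rather than the one TensorRT handed to enqueue.
    scaleKernel<<<blocks, threads, 0, stream>>>(in, out, n, 2.0f);

    // No cudaDeviceSynchronize or cudaStreamSynchronize here: the caller owns the stream.
    return cudaGetLastError() == cudaSuccess ? 0 : 1;
}

// Small driver that plays the role of the caller: it owns the stream, queues
// the copies and the enqueue call on it, and synchronizes once at the end.
std::vector<float> runEnqueueDemo(int batchSize)
{
    const int n = batchSize * kElemsPerSample;
    const size_t bytes = n * sizeof(float);
    std::vector<float> hostIn(n);
    std::vector<float> hostOut(n, -1.0f);
    for (int i = 0; i < n; ++i)
    {
        hostIn[i] = static_cast<float>(i);
    }

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);

    cudaMemcpyAsync(dIn, hostIn.data(), bytes, cudaMemcpyHostToDevice, stream);
    const void* inputs[] = {dIn};
    void* outputs[] = {dOut};
    enqueueSketch(batchSize, inputs, outputs, nullptr, stream);
    cudaMemcpyAsync(hostOut.data(), dOut, bytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaFree(dIn);
    cudaFree(dOut);
    cudaStreamDestroy(stream);
    return hostOut;
}
```

Calling `runEnqueueDemo(2)` from a `main` and compiling with `nvcc` should return each input value doubled. In the actual plugin, I assume the equivalent change is to add `stream` to every `<<<...>>>` launch inside `enqueue` and to replace any `cudaMemset` or `cudaMemcpy` there with `cudaMemsetAsync` or `cudaMemcpyAsync` on that stream.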