Plugin layer should use the same CUDA stream received in the enqueue call

Description

The YOLO plugin layer in the tensorrtx yolov5 implementation launches its kernels on the default CUDA stream instead of the stream passed to enqueue, which breaks in-order execution of the network's kernels and produces incorrect outputs.

Environment

TensorRT Version: 7.1
GPU Type: NVIDIA T4 (AWS)
Nvidia Driver Version: 440.82
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

https://github.com/wang-xinyu/tensorrtx/blob/40f6009b18bf6abd8ce19062eb1687875bf013f1/yolov5/yololayer.cu#L215

Steps To Reproduce

I am using this repo to implement YOLOv5 in DeepStream. After facing many issues and getting incorrect outputs, I figured out that there is a bug in that repo: the kernels in the plugin layer should use the same CUDA stream that the plugin receives in the call to enqueue. This ensures in-order execution of all the kernels in the entire network.

https://github.com/wang-xinyu/tensorrtx/blob/40f6009b18bf6abd8ce19062eb1687875bf013f1/yolov5/yololayer.cu#L215
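
For context, the code at the linked line follows roughly this pattern (a simplified sketch from my reading of the repo, not the exact code; kernel arguments are abbreviated):

```cpp
// enqueue() receives a cudaStream_t from TensorRT and forwards it:
int YoloLayerPlugin::enqueue(int batchSize, const void* const* inputs,
                             void** outputs, void* workspace, cudaStream_t stream)
{
    forwardGpu((const float* const*)inputs, (float*)outputs[0], stream, batchSize);
    return 0;
}

void YoloLayerPlugin::forwardGpu(const float* const* inputs, float* output,
                                 cudaStream_t stream, int batchSize)
{
    // BUG: the launch configuration omits the stream, so CalDetection runs
    // on the default stream instead of the one TensorRT passed in, and is
    // not ordered with the rest of the network's kernels:
    CalDetection<<<(numElem + mThreadCount - 1) / mThreadCount, mThreadCount>>>(
        /* ...kernel arguments... */);
}
```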

I don't have much experience writing TensorRT C++ API plugins, so I would like to know how to make the plugin layer use the same CUDA stream it received during the call to enqueue, which will ensure in-order execution of all the kernels.

Thanks

Hi @y14uc339,
Please refer to the following documentation on adding a custom plugin:
https://docs.nvidia.com/deeplearning/tensorrt/sample-support-guide/index.html#plugin_sample
The plugin sample listed there shows a complete example.
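
To illustrate the specific change being asked about (a minimal sketch of the general technique, not the sample from the documentation above; CalDetection here is a trivial stand-in for the repo's kernel): a CUDA kernel launch accepts the stream as the fourth launch-configuration parameter, after grid size, block size, and dynamic shared memory. Passing the stream received in enqueue puts the plugin's kernels in the same in-order queue as the rest of the network:

```cpp
#include <cuda_runtime.h>

// Hypothetical stand-in for the repo's CalDetection kernel; the real
// detection-decoding logic is omitted.
__global__ void CalDetection(const float* input, float* output, int numElem)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < numElem) output[idx] = input[idx];
}

// The relevant part of the fix: thread the stream from enqueue() through to
// the kernel launch.
void forwardGpu(const float* input, float* output, int numElem,
                cudaStream_t stream)
{
    const int threads = 256;
    const int blocks = (numElem + threads - 1) / threads;
    // <<<grid, block, dynamicSharedMemBytes, stream>>> -- launching on the
    // stream TensorRT passed to enqueue() keeps this kernel ordered with
    // every other kernel the engine submits to that stream.
    CalDetection<<<blocks, threads, 0, stream>>>(input, output, numElem);
}
```

Since CUDA guarantees that work submitted to a single stream executes in submission order, routing every kernel launch in the plugin through the enqueue stream is what restores in-order execution across the whole network.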


Thanks!