Hi, I am writing a TensorRT plugin with IPluginV2DynamicExt. For my use case, I want my plugin to produce 3 outputs: the first and second stored in device (GPU) memory, and the last one stored in host (CPU) memory. The last output only holds a flag produced by the algorithm. I don't want to put it on the GPU, because in the next layer's enqueue I would need to copy it back to CPU memory, which is time consuming.
What I want to know is:
- Does TensorRT support the situation I described?
- If it does, how should I implement it?
TensorRT Version: 126.96.36.199
GPU Type: Xavier
Nvidia Driver Version: 440.33
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.4