TensorRT GPU Inference on Volumes

Description

From this issue I understand that one can feed GPU memory addresses to a deserialized inference engine; the engine then runs inference reading the input tensor(s) from those locations and writing the result to the output tensor's location. This is clear when the input memory is linear, but what if one is working with volumes stored in CUDA arrays? That kind of structure does not expose a raw device pointer in the same sense, and under normal circumstances one would access it with surf3Dwrite() and surf3Dread(). If one wants to run TensorRT inference on such volumes, is there a way to do it besides a GPU-to-GPU copy?
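For context, here is a minimal sketch of the kind of allocation I mean (dimensions, element type, and the fill kernel are just illustrative). The volume lives in a cudaArray and is written through a surface object, so there is no linear pointer to hand to TensorRT:

// volume_array.cu -- 3D volume in a cudaArray, accessed via surf3Dwrite()
#include <cuda_runtime.h>

__global__ void fillVolume(cudaSurfaceObject_t surf, int w, int h, int d) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x < w && y < h && z < d) {
        float v = 0.0f;  // placeholder voxel value
        surf3Dwrite(v, surf, x * sizeof(float), y, z);  // x is a byte offset
    }
}

int main() {
    const int w = 128, h = 128, d = 128;

    // Allocate the volume as a cudaArray (opaque handle, no raw pointer).
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaExtent extent = make_cudaExtent(w, h, d);
    cudaArray_t volume = nullptr;
    cudaMalloc3DArray(&volume, &desc, extent, cudaArraySurfaceLoadStore);

    // Bind a surface object so kernels can write with surf3Dwrite().
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = volume;
    cudaSurfaceObject_t surf = 0;
    cudaCreateSurfaceObject(&surf, &resDesc);

    dim3 block(8, 8, 8), grid((w + 7) / 8, (h + 7) / 8, (d + 7) / 8);
    fillVolume<<<grid, block>>>(surf, w, h, d);
    cudaDeviceSynchronize();

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(volume);
    return 0;
}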

Environment

TensorRT Version: 7
CUDA Version: 10.2
CUDNN Version: 7.6
Operating System + Version: Windows 10 64-bit

I don't think there is any way to run TRT inference on volumes without a GPU-to-GPU copy.
TRT only accepts linear CUDA device-memory pointers at the network boundaries, not CUDA arrays.
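Roughly, the workaround would look like the sketch below: copy the cudaArray-backed volume into a linear device buffer with cudaMemcpy3D (a device-to-device copy, so it never leaves the GPU), then pass that pointer as the input binding. The engine layout, binding order, and dimensions here are assumptions purely for illustration.

// infer_from_volume.cu -- copy cudaArray -> linear device memory, then enqueue
#include <cuda_runtime.h>
#include <NvInfer.h>

void inferFromVolume(nvinfer1::IExecutionContext* context,
                     cudaArray_t volume,       // input volume stored in a cudaArray
                     void* outputDevBuffer,    // pre-allocated linear output buffer
                     int w, int h, int d,
                     cudaStream_t stream) {
    // Linear staging buffer that TensorRT can consume.
    float* inputDevBuffer = nullptr;
    size_t bytes = size_t(w) * h * d * sizeof(float);
    cudaMalloc(&inputDevBuffer, bytes);

    // Device-to-device copy: cudaArray -> linear (pitched) device memory.
    cudaMemcpy3DParms p = {};
    p.srcArray = volume;
    p.dstPtr   = make_cudaPitchedPtr(inputDevBuffer, w * sizeof(float), w, h);
    p.extent   = make_cudaExtent(w, h, d);  // in elements, since the source is an array
    p.kind     = cudaMemcpyDeviceToDevice;
    cudaMemcpy3DAsync(&p, stream);

    // Bindings in the order the engine expects (index 0 = input, 1 = output
    // is assumed here for the sketch).
    void* bindings[] = { inputDevBuffer, outputDevBuffer };
    context->enqueueV2(bindings, stream, nullptr);

    cudaStreamSynchronize(stream);
    cudaFree(inputDevBuffer);
}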

Thanks

Thanks for the clarification @SunilJB