OpenVX/VisionWorks graph input from GPU memory buffer

Description

Hello,

I am facing a problem running SiamRPN++ inference with TensorRT. The problem is present for
many similar network architectures and is, from what I have learned, due to TensorRT not supporting cross-correlation between two dynamic inputs (TensorRT seems to require a static kernel for this kind of operation).

The solutions I have found so far are to cut the network architecture just before the cross-correlation operation and to perform that operation either manually or with another framework.

One functional solution was to couple TensorRT with a second inference engine (ONNX Runtime), which supports this operation. However, for unknown reasons, performance with ONNX Runtime was terrible: more than 500 ms for a single inference, compared with around 60 ms for the whole network on the PC-based version. The hardware difference explains part of the gap, but the gap is not consistent with what was observed on ResNet-50 for comparison.

Another solution would be to reimplement the operation in CUDA, which seems particularly time-consuming and not portable (a rough sketch of what such a kernel would involve is shown below).
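For reference, here is a minimal, naive sketch of the depthwise cross-correlation SiamRPN++ needs (template features correlated channel by channel over the search features). It is only illustrative: the contiguous [C, H, W] float layout, the names, and the launch configuration are assumptions, and a production kernel would need tiling/shared memory to be competitive.

```cpp
#include <cuda_runtime.h>

// Naive depthwise cross-correlation (illustrative sketch, not optimized).
// x  : search features   [C, Hx, Wx], contiguous float
// z  : template features [C, Hz, Wz], used as a per-channel kernel
// out: response maps     [C, Ho, Wo] with Ho = Hx - Hz + 1, Wo = Wx - Wz + 1
__global__ void depthwise_xcorr(const float* x, const float* z, float* out,
                                int C, int Hx, int Wx, int Hz, int Wz)
{
    const int Ho = Hx - Hz + 1;
    const int Wo = Wx - Wz + 1;

    const int ox = blockIdx.x * blockDim.x + threadIdx.x; // output column
    const int oy = blockIdx.y * blockDim.y + threadIdx.y; // output row
    const int c  = blockIdx.z;                            // channel

    if (ox >= Wo || oy >= Ho || c >= C) return;

    const float* xc = x + (size_t)c * Hx * Wx;
    const float* zc = z + (size_t)c * Hz * Wz;

    float acc = 0.f;
    for (int ky = 0; ky < Hz; ++ky)
        for (int kx = 0; kx < Wz; ++kx)
            acc += xc[(oy + ky) * Wx + (ox + kx)] * zc[ky * Wz + kx];

    out[((size_t)c * Ho + oy) * Wo + ox] = acc;
}

// Hypothetical launch, assuming d_x/d_z/d_out are device buffers
// (e.g. TensorRT output bindings) with the dimensions above:
//   dim3 block(16, 16, 1);
//   dim3 grid((Wo + 15) / 16, (Ho + 15) / 16, C);
//   depthwise_xcorr<<<grid, block>>>(d_x, d_z, d_out, C, Hx, Wx, Hz, Wz);
```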
The last solution, which I am currently exploring, is to use OpenVX/VisionWorks and the vxMatchTemplate node to implement the mentioned operation.

Within the scope of that last solution, I am trying to bind a VisionWorks image to an already allocated GPU buffer (the output of TensorRT), but I could not find how to do so.
It seems vxCreateImageFromHandle would be a good starting point, but the memory-type parameter only offers Host and None, which does not seem to correspond to GPU memory (even though the memory is physically shared on the Jetson, I don't think the pointers are interchangeable this way).
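What I would try, if the NVX extensions of the installed VisionWorks provide a CUDA memory-type enum (I believe NVX/nvx.h declares NVX_MEMORY_TYPE_CUDA, but please treat that as an assumption to verify against your headers), is something along these lines. The function name, format, and dimensions are placeholders; the image format must match the element type of the TensorRT output, and a float output would require the corresponding NVX float format extension if present.

```cpp
#include <VX/vx.h>
#include <NVX/nvx.h>   // NVX extensions (assumed to declare NVX_MEMORY_TYPE_CUDA)

// Hedged sketch: wrap an already allocated CUDA device buffer (e.g. a TensorRT
// output binding) into a vx_image without copying it back to the host.
// `d_ptr` is a device pointer, `pitch` is the row stride of that buffer in bytes.
vx_image wrapCudaBuffer(vx_context context, void* d_ptr,
                        vx_uint32 width, vx_uint32 height, vx_int32 pitch)
{
    vx_imagepatch_addressing_t addr;
    addr.dim_x    = width;
    addr.dim_y    = height;
    addr.stride_x = 1;                 // bytes per element; U8 used only for illustration
    addr.stride_y = pitch;             // row pitch of the device allocation
    addr.scale_x  = VX_SCALE_UNITY;
    addr.scale_y  = VX_SCALE_UNITY;
    addr.step_x   = 1;
    addr.step_y   = 1;

    void* ptrs[] = { d_ptr };          // one pointer per plane; single plane here

    // The key point: pass the CUDA memory type instead of VX_MEMORY_TYPE_HOST,
    // so VisionWorks treats the handle as device memory.
    return vxCreateImageFromHandle(context, VX_DF_IMAGE_U8, &addr, ptrs,
                                   NVX_MEMORY_TYPE_CUDA);
}
```

If that enum is not available in your release, the only fallback I can think of is copying the TensorRT output into a regular host-backed vx_image, which of course defeats the purpose of keeping the data on the GPU.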

So the question is: is there a correct way to do this? Also, any recommendation concerning the overall problem described above would be greatly appreciated.

Thank you,
Regards.


Environment

TensorRT Version: 7.1.3-1
GPU Type: Jetson TX2 GPU
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0.0.1810-1
Operating System + Version: L4T R32
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Hi,
This looks like a Jetson issue. We recommend you raise it on the respective platform via the link below

Thanks!

OK, moved here:

Though the TensorRT issue mentioned is not Jetson-specific (it was also observed on a Windows platform), and the same goes for the VisionWorks buffer issue (though VisionWorks is mainly oriented toward Jetson).