Run multiple models (engines) with TensorRT without DeepStream

Description

Hi,
How can I configure multiple models (a back-to-back pipeline) with TensorRT, running in multiple threads? In such a configuration, how should the execution contexts be set up and memory be allocated for each engine?
With my current configuration I get the following memory error:

boxes_2, confs_2, clss_2 = trt_yolov3.detect(img, conf_th)
File "/home/cv/Downloads/tensorrt_demos-master/tensorrt_demos-master/utils2/yolov3.py", line 479, in detect
stream=self.stream)
File "/home/cv/Downloads/tensorrt_demos-master/tensorrt_demos-master/utils2/yolov3.py", line 367, in do_inference
[cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
File "/home/cv/Downloads/tensorrt_demos-master/tensorrt_demos-master/utils2/yolov3.py", line 367, in <listcomp>
[cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
pycuda._driver.LogicError: cuMemcpyHtoDAsync failed: invalid argument
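
For reference, the setup I am trying to reach is roughly the sketch below: every engine keeps its own execution context, CUDA stream, and host/device buffers, so nothing is shared between the two models (the engine paths, the load_engine helper, and the implicit-batch/fixed-shape assumption are only placeholders, not my exact code):

# Rough sketch: two independent TensorRT engines in one process, each with its
# own execution context, device buffers, and CUDA stream (TensorRT 7 / pycuda).
import numpy as np
import pycuda.autoinit          # creates one CUDA context and keeps it current
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(path):
    with open(path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

class TrtModel:
    """Bundles everything one engine needs, so buffers are never shared."""
    def __init__(self, engine_path):
        self.engine = load_engine(engine_path)
        self.context = self.engine.create_execution_context()
        self.stream = cuda.Stream()
        self.host, self.device, self.bindings = [], [], []
        for binding in self.engine:   # assumes implicit batch / fixed shapes
            size = trt.volume(self.engine.get_binding_shape(binding)) * \
                   self.engine.max_batch_size
            dtype = trt.nptype(self.engine.get_binding_dtype(binding))
            h = cuda.pagelocked_empty(size, dtype)   # page-locked host buffer
            d = cuda.mem_alloc(h.nbytes)             # matching device buffer
            self.host.append(h)
            self.device.append(d)
            self.bindings.append(int(d))

    def infer(self, inp):
        # The input must match the size/dtype the buffer was allocated with,
        # otherwise cuMemcpyHtoDAsync fails with "invalid argument".
        np.copyto(self.host[0], inp.ravel())
        cuda.memcpy_htod_async(self.device[0], self.host[0], self.stream)
        self.context.execute_async(bindings=self.bindings,
                                   stream_handle=self.stream.handle)
        for h, d in zip(self.host[1:], self.device[1:]):
            cuda.memcpy_dtoh_async(h, d, self.stream)
        self.stream.synchronize()
        return [h.copy() for h in self.host[1:]]

# Back-to-back pipeline: each model owns its context, buffers, and stream.
# model_a = TrtModel('yolov3.trt')        # illustrative paths
# model_b = TrtModel('second_model.trt')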

A clear and concise description of the bug or issue.

Environment

TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,

To run multiple models with TensorRT, I would recommend using either NVIDIA DeepStream or NVIDIA Triton Inference Server.
Please refer to the link below for more details:

If you want to do multi-threading with TensorRT, please refer to the best-practices guide below:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-best-practices/index.html#thread-safety
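
At a high level, the pattern from that guide is: the deserialized engine can be shared across threads, but each thread should own its execution context, CUDA stream, and buffers, and make the CUDA context current before touching the GPU. A rough Python sketch (the engine path and the per-batch inference body are placeholders, not a complete implementation):

# Rough multi-threading sketch: one shared engine, per-thread execution
# context, stream, and buffers; the shared CUDA context is pushed per thread.
import threading
import pycuda.driver as cuda
import tensorrt as trt

cuda.init()
cuda_ctx = cuda.Device(0).make_context()          # one CUDA context, shared
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

with open('model.trt', 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())   # safe to share read-only
cuda_ctx.pop()                                    # threads push it back as needed

def worker(batches):
    cuda_ctx.push()                               # make the context current here
    try:
        exec_ctx = engine.create_execution_context()   # per-thread context
        stream = cuda.Stream()                          # per-thread stream
        # Per-thread host/device buffers would be allocated here, exactly as in
        # a single-engine setup, so no two threads touch the same memory.
        for batch in batches:
            pass  # run inference with exec_ctx/stream/buffers -- placeholder
    finally:
        cuda_ctx.pop()

threads = [threading.Thread(target=worker, args=([],)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
cuda_ctx.detach()                                 # tear down the context at exit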

Thanks