As mentioned in the developer guide: "An engine can have multiple execution contexts, allowing one set of weights to be used for multiple overlapping inference tasks."
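If I understand that correctly, one deserialized engine can hand out several execution contexts that all share the same weights, something like this (assuming `engine` is an already-deserialized ICudaEngine):

# Two execution contexts backed by one engine's weights (my understanding)
context_a = engine.create_execution_context()
context_b = engine.create_execution_context()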
Also, I read that they can be used in parallel. Does that mean invoking two execution contexts in parallel, for example performing detection on two different images at the same time using the same model? If yes, how do I do that? (See the sketch at the end of this post for the kind of thing I mean.) Normal threading in Python does not seem to work with TensorRT: I tried putting the inference process in a thread and it kept failing until I did the following:
import threading

import pycuda.driver as cuda
import tensorrt as trt

cuda.init()  # initialize the CUDA driver before make_context()

class MyEngine(object):
    # Initialize the engines
    def __init__(self, engine_paths, classes_paths, source, imgsz, img_label, vid_size, ...):
        # Create a dedicated CUDA context for this engine
        # (make_context() also makes it current on the creating thread)
        self.cfx = cuda.Device(0).make_context()
        ...
        # logger is used for TensorRT logging
        logger = trt.Logger(trt.Logger.WARNING)
        logger.min_severity = trt.Logger.Severity.ERROR
        # runtime is an instance of the TensorRT runtime
        runtime = trt.Runtime(logger)
        trt.init_libnvinfer_plugins(logger, '')  # initialize TensorRT plugins
        self.stream = cuda.Stream()

    def inference(self):
        ...
class myThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        self.func(*self.args)
class Predictor(MyEngine):
    def __init__(self, engine_paths, classes_paths, source, imgsz, img_label, vid_size, ...):
        super(Predictor, self).__init__(engine_paths, classes_paths, source, imgsz, img_label, vid_size, ...)

# Build the predictor on the main thread, then run inference in a daemon thread
pred = Predictor(engine_paths=engines_paths, classes_paths=classes_paths, ...)
mainThread = myThread(pred.inference, [])
mainThread.daemon = True
mainThread.start()
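One thing I noticed in the threaded TensorRT samples I found is that the worker thread always pushes the CUDA context before doing GPU work and pops it afterwards, roughly like this (a simplified sketch, not my actual code; `run_engine` is a placeholder for the real pre-processing, inference, and post-processing):

def threaded_inference(cfx, run_engine):
    # cfx is the context created with cuda.Device(0).make_context();
    # run_engine is a placeholder callable for the actual TensorRT work
    cfx.push()       # make the CUDA context current on this worker thread
    try:
        run_engine()
    finally:
        cfx.pop()    # release it so other threads can push the same context

Is that push()/pop() pair what makes the threading work?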
I arrived at the code above after a lot of trial and error and after digging through many documents and code samples. I did not understand why it worked, but it did! I'm still learning, so can anyone please explain why it works? Thank you.
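Finally, here is the sketch I mentioned above of what I am ultimately trying to do (a rough sketch of what I imagine, using the TensorRT 8-style API; "model.engine", the image shapes, and the buffer handling are all placeholders, and I don't know if this approach is correct, which is part of my question):

import threading

import numpy as np
import pycuda.driver as cuda
import tensorrt as trt

cuda.init()
dev_ctx = cuda.Device(0).make_context()  # one CUDA context, shared by both threads

logger = trt.Logger(trt.Logger.ERROR)
runtime = trt.Runtime(logger)
with open("model.engine", "rb") as f:  # "model.engine" is a placeholder path
    engine = runtime.deserialize_cuda_engine(f.read())

# Two execution contexts from the same engine, sharing one set of weights
exec_ctx_a = engine.create_execution_context()
exec_ctx_b = engine.create_execution_context()
dev_ctx.pop()  # release the CUDA context from the main thread

def detect(exec_ctx, image):
    dev_ctx.push()  # make the shared CUDA context current on this thread
    try:
        stream = cuda.Stream()  # each thread uses its own stream
        # Not shown: allocate device buffers, copy `image` to the GPU, then
        #   exec_ctx.execute_async_v2(bindings=[...], stream_handle=stream.handle)
        # and copy the detections back.
    finally:
        dev_ctx.pop()  # release it again for the other thread

img_a = np.zeros((1, 3, 640, 640), dtype=np.float32)  # placeholder input shapes
img_b = np.zeros((1, 3, 640, 640), dtype=np.float32)

t_a = threading.Thread(target=detect, args=(exec_ctx_a, img_a))
t_b = threading.Thread(target=detect, args=(exec_ctx_b, img_b))
t_a.start(); t_b.start()
t_a.join(); t_b.join()

I went with one shared CUDA context plus push()/pop() in each thread because that mirrors the self.cfx pattern above, but maybe each thread should own its own context instead?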