Hi all,
Purpose: I need to run TensorRT inference in a second thread.
I have read this document, but I still have no idea how exactly to do the TensorRT part in Python.
I already have a sample that runs successfully on TensorRT.
Now I just want to run a really simple multi-threaded example with TensorRT.
(I have already generated the TensorRT engine, so I will load the engine and do TensorRT inference from a thread.)
Here is my code (without the TensorRT part):
import threading
import time
from my_tensorrt_code import TRTInference, trt

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        print("Starting " + self.args[0])
        self.func(*self.args)
        print("Exiting " + self.args[0])

if __name__ == '__main__':
    # Create new threads
    '''
    format thread:
        - func: the function we wish to run
        - args: the arguments passed to func
    '''
    trt_engine_path = './tensorrt_engine.trt'
    max_batch_size = 1
    trt_inference_wrapper = TRTInference(trt_engine_path,
                                         trt_engine_datatype=trt.DataType.FLOAT,
                                         batch_size=max_batch_size)

    # Get TensorRT SSD model output
    input_img_path = './testimage.png'
    thread1 = myThread(trt_inference_wrapper.infer, [input_img_path])

    # Start the new thread
    thread1.start()
    thread1.join()
    print("Exiting Main Thread")
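For what it's worth, the threading wrapper itself behaves as expected when I substitute a plain Python function for the TensorRT call. Here is a TensorRT-free sketch (fake_infer is just a hypothetical stand-in for trt_inference_wrapper.infer):

```python
import threading

# Hypothetical stand-in for TRTInference.infer, with no CUDA involved.
def fake_infer(img_path, results):
    results.append("inferred " + img_path)

class myThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        self.func(*self.args)

results = []
t = myThread(fake_infer, ["./testimage.png", results])
t.start()
t.join()
print(results)  # ['inferred ./testimage.png']
```

So the scaffolding seems fine; the problem only appears once real TensorRT inference runs inside the thread.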
The TRTInference code is essentially the same as the Python part of the uff_ssd sample provided with TensorRT.
However, when I run this code, I always get the following error messages, on every platform I have tried (desktop, TX2, and AGX):
[TensorRT] ERROR: ../rtSafe/cuda/caskConvolutionRunner.cpp (290) - Cask Error in checkCaskExecError<false>: 7 (Cask Convolution execution)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
I traced the code and found that the error is raised inside the do_inference function:
def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back to the host.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream and return the host outputs.
    stream.synchronize()
    return [out.host for out in outputs]
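My current guess is that this is related to which thread the CUDA resources belong to: TRTInference's constructor (and whatever CUDA setup it performs) runs on the main thread, while infer runs on the worker thread. A pure-Python check (threading only, no TensorRT) confirms that run() really does execute on a different thread than __init__:

```python
import threading

main_thread_id = threading.get_ident()
seen = {}

class Worker(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        # __init__ executes on whichever thread creates the object.
        seen['init'] = threading.get_ident()

    def run(self):
        # run() executes on the newly started thread.
        seen['run'] = threading.get_ident()

w = Worker()
w.start()
w.join()
print(seen['init'] == main_thread_id)  # True
print(seen['run'] == main_thread_id)   # False
```

If that is the cause, I probably need to make the CUDA context current in the worker thread before inference, but I am not sure of the right way to do this with TensorRT.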
Could you share some suggestions on how to fix this error?
It happens not only on the desktop but also on the Jetson devices…
Thank you so much!
Best regards,
Chieh
Reference: