Error after successful DLA engine inference

We can run inference and measure the speed successfully with a DLA TensorRT engine, but an error appears after inference finishes.


Are you running it with a customized sample?
If so, please remember to release or destroy the TensorRT engine.

For example, with the Python sample:


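if __name__ == "__main__":
    engine = PrepareEngine()
    # ... run inference here ...

    # Drop the reference so the engine is released before the script exits.
    engine = []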
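To illustrate why rebinding the name helps, here is a minimal sketch that does not use TensorRT at all. FakeEngine and prepare_engine are hypothetical stand-ins for the real engine and for PrepareEngine() in the sample. It assumes CPython, where an object is destroyed as soon as its last reference goes away.

```python
import gc

released = []  # records which stand-in objects have been destroyed


class FakeEngine:
    """Hypothetical stand-in for a TensorRT engine (not the real API)."""

    def __del__(self):
        released.append("engine")


def prepare_engine():
    # Placeholder for the sample's PrepareEngine(); returns the stand-in.
    return FakeEngine()


engine = prepare_engine()
# ... inference would run here ...

engine = []   # rebinding drops the last reference to the engine object
gc.collect()  # not required on CPython; harmless on other interpreters

print(released)
```

Writing `del engine` instead of `engine = []` should have the same effect, as long as no other variable still refers to the engine.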


Why does the TensorRT engine need to be released or destroyed? Do you mean I should release or destroy the TensorRT engine after inference?

After adding engine = [], the error still occurs. What’s the reason for the error? It seems to have no influence on inference.


Would you mind sharing a reproducible source with us first?
Thanks. (8.0 KB) (2.1 KB)

our.engine (1.5 MB)
JetPack 4.6.1


We checked your source but hit an error related to pycuda.

$ python3
Reading engine from file resnet.engine
Traceback (most recent call last):
  File "", line 61, in <module>
    trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
  File "/home/nvidia/topic_220555/", line 180, in do_inference_v2
    [cuda.memcpy_htod_async(inp.device,, stream) for inp in inputs]
  File "/home/nvidia/topic_220555/", line 180, in <listcomp>
    [cuda.memcpy_htod_async(inp.device,, stream) for inp in inputs]
pycuda._driver.LogicError: cuMemcpyHtoDAsync failed: invalid argument

Could you double-check the sample?
Does the script require a specific pycuda version?


This error may result from the input tensor dimensions. (1.4 MB)
You can try it again with python 1 480_int8_dla0.engine. Thank you!
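If the input dimensions are the cause, a quick size check before the host-to-device copy may help narrow it down. The sketch below is an assumption, not part of the shared script: expected_nbytes and check_input are hypothetical helpers, and the shape (1, 3, 224, 224) with 4-byte elements is only an example value.

```python
from functools import reduce
import operator


def expected_nbytes(shape, itemsize):
    """Bytes needed for a tensor with the given shape and element size."""
    return reduce(operator.mul, shape, 1) * itemsize


def check_input(host_nbytes, shape, itemsize):
    """Raise ValueError if the host buffer size differs from what the shape implies."""
    need = expected_nbytes(shape, itemsize)
    if host_nbytes != need:
        raise ValueError(
            "host buffer is {} bytes, but shape {} needs {} bytes".format(
                host_nbytes, tuple(shape), need
            )
        )
    return need


# Example values only: a float32 image batch of shape (1, 3, 224, 224).
print(check_input(1 * 3 * 224 * 224 * 4, (1, 3, 224, 224), 4))
```

In a real script, the buffer size would come from the host array (for a NumPy array, its nbytes attribute) and the shape from the engine's input binding.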


Please try the following to release the TensorRT context.

diff --git a/ b/
index 57dab9a..6703aaa 100644
--- a/
+++ b/
@@ -77,5 +77,6 @@ while count > 0:
 cost = time.time()-st
 print ('count =', num)
 print ('cost: ', cost, cost / num)
+context = []
 #engine = []
 #print("====preds====:", trt_outputs[0].shape)
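The same idea can be made robust against exceptions with try/finally, so that the context is dropped before the engine even if inference fails. This is a sketch under the assumption, suggested by the patch above, that the context should be released first. FakeEngine, FakeContext, and run are hypothetical stand-ins and not TensorRT classes.

```python
order = []  # records the destruction order of the stand-in objects


class FakeEngine:
    """Hypothetical stand-in for a TensorRT engine."""

    def __del__(self):
        order.append("engine")


class FakeContext:
    """Hypothetical stand-in for an execution context created from an engine."""

    def __init__(self, engine):
        self.engine = engine  # the context depends on its engine

    def __del__(self):
        order.append("context")


def run(fail=False):
    engine = FakeEngine()
    context = FakeContext(engine)
    try:
        if fail:
            raise RuntimeError("inference failed")
        return "ok"
    finally:
        # Release in reverse order of creation, even when an error occurred.
        del context
        del engine


result = run()
print(result, order)
```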


