PyCUDA error when running inference with TensorRT on Jetson Nano

I was able to convert my Darknet YOLOv3 model to TensorRT, and I also ran the prediction successfully once. But when I ran it again, it gave the error below.
I used the example from
/usr/src/tensorrt/samples/python/yolov3_onnx/
Since my Darknet model is custom with only 1 class, each YOLO layer has 18 filters and the input shape is 416.
So I changed output_shapes = [(1, 18, 13, 13), (1, 18, 26, 26), (1, 18, 52, 52)]
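Those numbers follow mechanically from the class count and input size. A small sketch (assuming the standard YOLOv3 strides of 32/16/8 and 3 anchors per head) makes the relationship explicit:

```python
def yolo_output_shapes(num_classes, input_size=416, batch=1):
    # Each YOLO head predicts 3 anchors x (num_classes + 5) values per grid cell:
    # 4 box coordinates + 1 objectness score + the class scores.
    filters = 3 * (num_classes + 5)
    # The three heads downsample the input by 32, 16, and 8 respectively.
    return [(batch, filters, input_size // stride, input_size // stride)
            for stride in (32, 16, 8)]

print(yolo_output_shapes(1, 416))   # [(1, 18, 13, 13), (1, 18, 26, 26), (1, 18, 52, 52)]
print(yolo_output_shapes(80, 608))  # [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)]
```

The second call reproduces the stock sample's shapes for the 80-class COCO model at 608×608, which is a quick sanity check that the formula matches the sample.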

Traceback (most recent call last):
  File "onnx_to_tensorrt.py", line 190, in <module>
    main()
  File "onnx_to_tensorrt.py", line 166, in main
    trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
  File "/home/experio/Documents/yolov3_onnx/common.py", line 191, in do_inference_v2
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
  File "/home/experio/Documents/yolov3_onnx/common.py", line 191, in <listcomp>
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
pycuda._driver.LogicError: cuMemcpyHtoDAsync failed: invalid argument

The error is in common.py:

    def do_inference_v2(context, bindings, inputs, outputs, stream):
        # Transfer input data to the GPU.
        [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
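A `cuMemcpyHtoDAsync failed: invalid argument` at this line usually means the host buffer and the device allocation disagree in size. A pure-Python sketch of a pre-flight check that would catch the mismatch before the copy (the binding name `000_net` and the shapes below are illustrative, not read from the actual engine):

```python
from math import prod

def find_size_mismatches(binding_shapes, host_element_counts):
    # Compare each host buffer's element count against the volume the engine
    # allocated for that binding; a mismatch makes cuMemcpyHtoDAsync fail
    # with "invalid argument".
    return [(name, prod(shape), host_element_counts[name])
            for name, shape in binding_shapes.items()
            if prod(shape) != host_element_counts[name]]

# Illustrative scenario: engine built for 608x608, image preprocessed to 416x416.
binding_shapes = {"000_net": (1, 3, 608, 608)}
host_element_counts = {"000_net": 1 * 3 * 416 * 416}
print(find_size_mismatches(binding_shapes, host_element_counts))
# [('000_net', 1108992, 519168)]
```

In the real script the expected counts would come from the engine's binding shapes and the actual counts from the pagelocked host buffers allocated in `common.allocate_buffers()`.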

Hi,

The error indicates that the buffer sizes don't match.
For a customized model, you may also need to update the postprocessing configuration:

postprocessor_args = {"yolo_masks": [(6, 7, 8), (3, 4, 5), (0, 1, 2)],                    # A list of 3 three-dimensional tuples for the YOLO masks
                      "yolo_anchors": [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),  # A list of 9 two-dimensional tuples for the YOLO anchors
                                       (59, 119), (116, 90), (156, 198), (373, 326)],
                      "obj_threshold": 0.6,                                               # Threshold for object coverage, float value between 0 and 1
                      "nms_threshold": 0.5,                                               # Threshold for non-max suppression algorithm, float value between 0 and 1
                      "yolo_input_resolution": input_resolution_yolov3_HW}

This information can be found in the .cfg file.
Could you check it and make the corresponding updates?

Thanks.

Hi,

I have checked the anchors and masks in the custom .cfg file; they are the same. After running the code, I still get the same error.

trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)

In common.py:

    def do_inference_v2(context, bindings, inputs, outputs, stream):
        # Transfer input data to the GPU.
        [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]

This line raises the error. I also don't understand why it ran fine the first time but now throws this error.
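One plausible explanation for "worked once, then failed" (an assumption on my part, based on how the sample caches engines) is that onnx_to_tensorrt.py serializes the built engine to a .trt file on the first run and deserializes it on every later run; if the input/output settings were changed after that first build, the cached engine no longer matches the host buffers. Forcing a rebuild rules this out:

```python
import os

# Path the sample uses for the cached engine; adjust if you renamed it.
engine_file_path = "yolov3.trt"

# Remove the stale serialized engine so get_engine() rebuilds it from the ONNX
# file with the current input/output dimensions.
if os.path.exists(engine_file_path):
    os.remove(engine_file_path)
```

If the error disappears after a forced rebuild, the stale cached engine was the culprit.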

Thanks.

Hi,

Would you mind sharing the customized model (.cfg & weights) and any modified source with us?
We would like to reproduce this in our environment first.

Thanks.

Hi,
Here is the weights file,
and here is the cfg.

Thanks

Hi,

Since your input is (416,416), you will also need to update the input dimension:

diff --git a/onnx_to_tensorrt.py b/onnx_to_tensorrt.py
index c4fd70b..86b8fb4 100644
--- a/onnx_to_tensorrt.py
+++ b/onnx_to_tensorrt.py
@@ -113,7 +113,7 @@ def get_engine(onnx_file_path, engine_file_path=""):
                         print (parser.get_error(error))
                     return None
             # The actual yolov3.onnx is generated with batch size 64. Reshape input to batch size 1
-            network.get_input(0).shape = [1, 3, 608, 608]
+            network.get_input(0).shape = [1, 3, 416, 416]
             print('Completed parsing of ONNX file')
             print('Building an engine from file {}; this may take a while...'.format(onnx_file_path))
             engine = builder.build_cuda_engine(network)
@@ -141,7 +141,7 @@ def main():
         'https://github.com/pjreddie/darknet/raw/f86901f6177dfc6116360a13cc06ab680e0c86b0/data/dog.jpg', checksum_reference=None)

     # Two-dimensional tuple with the target network's (spatial) input resolution in HW ordered
-    input_resolution_yolov3_HW = (608, 608)
+    input_resolution_yolov3_HW = (416, 416)
     # Create a pre-processor object by specifying the required input resolution for YOLOv3
     preprocessor = PreprocessYOLO(input_resolution_yolov3_HW)
     # Load an image from the specified input path, and return it together with  a pre-processed version
@@ -150,7 +150,8 @@ def main():
     shape_orig_WH = image_raw.size

     # Output shapes expected by the post-processor
-    output_shapes = [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)]
+    output_shapes = [(1, 18, 13, 13), (1, 18, 26, 26), (1, 18, 52, 52)]
+
     # Do inference with TensorRT
     trt_outputs = []
     with get_engine(onnx_file_path, engine_file_path) as engine, engine.create_execution_context() as context:

However, we hit an error when converting your model.
Could you first check whether the output dimensions are correct?

$ python3 onnx_to_tensorrt.py
Reading engine from file yolov3.trt
Running inference on image dog.jpg...
Traceback (most recent call last):
  File "onnx_to_tensorrt.py", line 186, in <module>
    main()
  File "onnx_to_tensorrt.py", line 166, in main
    trt_outputs = [output.reshape(shape) for output, shape in zip(trt_outputs, output_shapes)]
  File "onnx_to_tensorrt.py", line 166, in <listcomp>
    trt_outputs = [output.reshape(shape) for output, shape in zip(trt_outputs, output_shapes)]
ValueError: cannot reshape array of size 6498 into shape (1,18,13,13)
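The failing size is itself diagnostic: 6498 = 18 × 19 × 19, so the engine is still producing a 19×19 grid, which is what a 608×608 input yields (608 / 32 = 19), while a 416 input would give 13×13. This suggests the deserialized engine (or the ONNX export) was still built for 608×608 when the 416 output shapes were applied:

```python
# Per-head element count reported by the reshape error.
reported = 6498

# 18 filters on a 19x19 grid accounts for it exactly: the engine still runs at 608.
assert reported == 18 * 19 * 19
assert 608 // 32 == 19   # grid the engine actually produced
assert 416 // 32 == 13   # grid the post-processor expects
print("engine output is 19x19 -> built for 608x608 input")
```

Rebuilding the engine (and re-exporting the ONNX if it was also generated at 608) with the 416×416 input should make the output reshape to (1, 18, 13, 13) as expected.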

Thanks.