PyCUDA error when running inference with TensorRT on Jetson Nano

I was able to convert my Darknet YOLOv3 model to TensorRT, and I was also able to run a prediction once. But when I ran it again, it gave this error.
I used the example from
/usr/src/tensorrt/samples/python/yolov3_onnx/
Since my Darknet model is custom with only 1 class, there are 18 filters and an input shape of 416, so I changed:

output_shapes = [(1, 18, 13, 13), (1, 18, 26, 26), (1, 18, 52, 52)]
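
For reference, this is how those shapes follow from the cfg (a sketch; the grid sizes are the 416 input divided by the strides 32/16/8, and filters = 3 * (classes + 5)):

# Sketch: derive the TensorRT output shapes for a custom YOLOv3 model.
# Assumes batch size 1 and the usual three YOLO heads with strides 32/16/8.
num_classes = 1
input_size = 416

filters = 3 * (num_classes + 5)  # 3 anchors per head; each box is (x, y, w, h, obj) + classes
output_shapes = [(1, filters, input_size // s, input_size // s) for s in (32, 16, 8)]
print(output_shapes)  # [(1, 18, 13, 13), (1, 18, 26, 26), (1, 18, 52, 52)]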

Traceback (most recent call last):
  File "onnx_to_tensorrt.py", line 190, in <module>
    main()
  File "onnx_to_tensorrt.py", line 166, in main
    trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
  File "/home/experio/Documents/yolov3_onnx/common.py", line 191, in do_inference_v2
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
  File "/home/experio/Documents/yolov3_onnx/common.py", line 191, in <listcomp>
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
pycuda._driver.LogicError: cuMemcpyHtoDAsync failed: invalid argument

The error is in common.py:

    def do_inference_v2(context, bindings, inputs, outputs, stream):
        # Transfer input data to the GPU.
        [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]

Hi,

The error indicates that the buffer sizes don't match.
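
For example, you can compare what the engine expects with what was allocated (a minimal sketch; engine and inputs are the objects returned by the sample's common.allocate_buffers(), where binding 0 is the image input):

import tensorrt as trt

# Sketch: compare the engine's input binding with the page-locked host buffer.
# cuMemcpyHtoDAsync fails with "invalid argument" when these sizes disagree.
binding_shape = engine.get_binding_shape(0)  # e.g. (3, 416, 416)
expected = trt.volume(binding_shape)         # element count the engine expects
actual = inputs[0].host.size                 # element count actually allocated
print("engine expects", expected, "elements; host buffer has", actual)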
For a customized model, you may also need to update the postprocessing configuration:

postprocessor_args = {"yolo_masks": [(6, 7, 8), (3, 4, 5), (0, 1, 2)],                    # A list of 3 three-dimensional tuples for the YOLO masks
                      "yolo_anchors": [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),  # A list of 9 two-dimensional tuples for the YOLO anchors
                                       (59, 119), (116, 90), (156, 198), (373, 326)],
                      "obj_threshold": 0.6,                                               # Threshold for object coverage, float value between 0 and 1
                      "nms_threshold": 0.5,                                               # Threshold for non-max suppression algorithm, float value between 0 and 1
                      "yolo_input_resolution": input_resolution_yolov3_HW}

This information can be found in the cfg file.
Could you give it a check and make the corresponding updates? As a starting point, the mask and anchors lines can be read straight out of the [yolo] sections (a sketch; the cfg path below is a placeholder):
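
# Sketch: list the mask and anchors of every [yolo] section in a Darknet cfg,
# so they can be compared against postprocessor_args.
def read_yolo_layers(cfg_path):
    layers, current = [], None
    with open(cfg_path) as f:
        for line in f:
            line = line.split('#')[0].strip()
            if line.startswith('['):
                # A new section begins; only track [yolo] sections.
                current = {} if line == '[yolo]' else None
                if current is not None:
                    layers.append(current)
            elif current is not None and '=' in line:
                key, value = (p.strip() for p in line.split('=', 1))
                if key in ('mask', 'anchors'):
                    current[key] = value
    return layers

for layer in read_yolo_layers('yolov3-custom.cfg'):  # placeholder path
    print(layer)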

Thanks.

Hi,

I have checked the anchors and masks in the custom cfg file; they are the same. After running the code, I still get the same error.

trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)

In common.py:

def do_inference_v2(context, bindings, inputs, outputs, stream):
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]

This line is throwing the error. I also don't understand why it worked the first time but is now raising this error.
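
In case it helps narrow this down, a minimal check that could be run right before that call (a sketch; the names come from the sample's common.py and onnx_to_tensorrt.py):

import tensorrt as trt

# Sketch: dump every engine binding next to the allocated host buffers just
# before do_inference_v2, to see which copy is being rejected.
for idx in range(engine.num_bindings):
    shape = engine.get_binding_shape(idx)
    print(engine.get_binding_name(idx), shape, "->", trt.volume(shape), "elements")
for i, mem in enumerate(inputs + outputs):
    print("host buffer", i, ":", mem.host.size, "elements")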

Thanks.

Hi,

Would you mind sharing the customized model (.cfg & weights) and the modified source (if any) with us?
We would like to reproduce this in our environment first.

Thanks.

Hi,
Here are the weights
Here is the cfg

Thanks