Hi @aravind.chakravarti ,
Please check the sample in the link, which implements a full ONNX-based pipeline for performing inference with the YOLOv3-608 network, including pre- and post-processing.
I could extract the code below from the onnx_to_tensorrt.py file.
Let me try it with my code; if I hit any roadblocks I will post in the forum.
with get_engine(onnx_file_path, engine_file_path) as engine, engine.create_execution_context() as context:
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)
    # Do inference
    print("Running inference on image {}...".format(input_image_path))
    # Set host input to the image. The common.do_inference function will copy the input to the GPU before executing.
    inputs[0].host = image
    trt_outputs = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
    # Before doing post-processing, we need to reshape the outputs, as common.do_inference gives us flat arrays.
    trt_outputs = [output.reshape(shape) for output, shape in zip(trt_outputs, output_shapes)]
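To illustrate that last reshape step in isolation: here is a minimal NumPy sketch, assuming the three `output_shapes` used in the YOLOv3-608 sample (the three detection heads) and using zero-filled arrays as stand-ins for the flat buffers that `common.do_inference_v2` would return.

```python
import numpy as np

# Output shapes of the three YOLOv3-608 detection heads, as listed in the sample.
output_shapes = [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)]

# Stand-ins for the flat 1-D host buffers TensorRT returns, one per output binding.
trt_outputs = [np.zeros(int(np.prod(shape)), dtype=np.float32) for shape in output_shapes]

# The same reshape as in the snippet above: restore each flat array to its grid shape.
trt_outputs = [output.reshape(shape) for output, shape in zip(trt_outputs, output_shapes)]

for output in trt_outputs:
    print(output.shape)
```

The reshape is needed because the post-processing code (box decoding, NMS) indexes into the grid dimensions, which a flat array does not expose.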
For those who are here looking for a working example:
I have uploaded my Hello World example code to GitHub. It is an MNIST classifier, in which I am running inference using,