Validation with an engine file that gives 2 outputs

Hi,
My ONNX model gives 2 outputs, and I converted it to an engine file using the trtexec command line.
Now I need to validate the accuracy using this engine file. I already have working validation code for a single output; could you tell me where to modify the code so that it supports 2 outputs?

code:

print("\n---Dataset Load Done---")

# Engine/Context Creation
with open(args.engine_input, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

print("---Engine Creation Done---")

context = engine.create_execution_context()

print("---Context Creation Done---")


# Memory buffers: one device buffer for the input binding,
# one host/device pair for the output binding
for binding in engine:
    if engine.binding_is_input(binding):
        input_shape = engine.get_binding_shape(binding)
        print("binding shape:", input_shape)
        input_size = trt.volume(input_shape) * args.batch_size * np.dtype(np.float32).itemsize
        #input_size = trt.volume(input_shape) * engine.max_batch_size * np.dtype(np.float16).itemsize
        device_input = cuda.mem_alloc(input_size)

    else:
        output_shape = engine.get_binding_shape(binding)
        # Create a page-locked host buffer (i.e. won't be swapped to disk),
        # so the async device-to-host copy below is truly asynchronous
        host_output = cuda.pagelocked_empty(tuple(output_shape), dtype=np.float32)
        device_output = cuda.mem_alloc(host_output.nbytes)

batch_time = AverageMeter()
top1 = AverageMeter()
top5 = AverageMeter()
stream = cuda.Stream()
time_array = []
loop_counter = 0

for i, (input, target) in enumerate(loader):
    print("\n--------------------------\n")

    #tracemalloc.start()
    

    # Build a contiguous float32 host array from the batch tensor
    host_input = np.array(input, dtype=np.float32, order='C')
    
    #stream = cuda.Stream()
    #print("---Stream Created---")
    start_process = time.time()
    cuda.memcpy_htod_async(device_input, host_input, stream)
    #print("---Copied to Device---")
    
    # Inference: bindings lists the device pointers in binding order
    # (input first, then output)
    #print(type(device_input))
    context.execute_async(args.batch_size, bindings=[int(device_input), int(device_output)], stream_handle=stream.handle)
    #print("---Inference Done---")
    cuda.memcpy_dtoh_async(host_output, device_output, stream)
    stream.synchronize()
    end_process = time.time()        
    time_array.append(end_process - start_process)

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:
https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#onnx-export

  1. Validate your model with the below snippet:

check_model.py

import onnx

filename = "yourONNXmodel"  # placeholder: path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command (see the example below).

In case you are still facing issues, request you to share the trtexec "--verbose" log for further debugging.
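
For reference, a basic trtexec invocation looks something like the line below; the file names here are placeholders, not the actual files from this thread.

trtexec --onnx=yourONNXmodel.onnx --saveEngine=model.engine --verbose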
Thanks!

Thank you for the reply.

  1. At first, I created the ONNX file named initial.onnx from this repo (GitHub - facebookresearch/detr: End-to-End Object Detection with Transformers) and tried to convert this ONNX into an engine file using the trtexec command line.

  2. When I tried this, I got a few errors mentioning builtin_op_importers. I had seen the same error in another issue ticket, where they suggested using onnx-simplifier. Using onnx-simplifier, I generated a simplified.onnx file (a sketch of this step follows this list).

  3. Using this simplified.onnx, I was able to create an int8 engine file without any errors using trtexec. The engine was named detr.engine.
    Now I need to validate this detr.engine file and get accuracy values on the COCO dataset.
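
For reference, the simplification step in item 2 was along these lines; this is a minimal sketch assuming onnx-simplifier's Python API (the onnxsim command-line tool works too), with the file names from above:

import onnx
from onnxsim import simplify

model = onnx.load("initial.onnx")
model_simp, check = simplify(model)
assert check, "simplified ONNX model could not be validated"
onnx.save(model_simp, "simplified.onnx")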

I have attached all the files in this Drive link: TensorRT - Google Drive

Also, I have tried check_model.py by passing the ONNX file, and the check passed without any errors or warnings.

Thank you

Hi @nikhil.chowdary,

As you would like to get multiple outputs, hope the following similar thread may help you.
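
In short, the change is to allocate one host/device buffer pair per output binding and to pass all device pointers in the bindings list. Below is a minimal sketch of that change against your script, assuming the same implicit-batch API you already use (binding 0 is the input, bindings 1 and 2 are the two outputs):

# Allocate one device buffer for the input and one host/device pair per output
host_outputs, device_outputs = [], []
for binding in engine:
    if engine.binding_is_input(binding):
        input_shape = engine.get_binding_shape(binding)
        input_size = trt.volume(input_shape) * args.batch_size * np.dtype(np.float32).itemsize
        device_input = cuda.mem_alloc(input_size)
    else:
        output_shape = engine.get_binding_shape(binding)
        host_output = cuda.pagelocked_empty(tuple(output_shape), dtype=np.float32)
        host_outputs.append(host_output)
        device_outputs.append(cuda.mem_alloc(host_output.nbytes))

# ... then, inside the loop over the dataset:
cuda.memcpy_htod_async(device_input, host_input, stream)
# bindings must follow binding order: the input first, then both outputs
bindings = [int(device_input)] + [int(d) for d in device_outputs]
context.execute_async(args.batch_size, bindings=bindings, stream_handle=stream.handle)
# copy each output back into its own host buffer
for host_out, device_out in zip(host_outputs, device_outputs):
    cuda.memcpy_dtoh_async(host_out, device_out, stream)
stream.synchronize()

After stream.synchronize(), host_outputs holds both output arrays (for DETR, typically the class logits and the predicted boxes), which you can then feed into your COCO accuracy computation.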

Thank you.