Validation with an engine file that gives 2 outputs

Hi,
My ONNX model gives 2 outputs, and I converted it to an engine file using the trtexec command line.
Now I need to validate the accuracy using this engine file. I already have working validation code for a single output; could you tell me where to modify the code so that it supports 2 outputs?

code:

print("\n---Dataset Load Done---")

# Engine/Context Creation
with open(args.engine_input, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

print("---Engine Creation Done---")

context = engine.create_execution_context()

print("---Context Creation Done---")


# Memory buffers: one device buffer for the input binding,
# one host/device pair for the output binding
for binding in engine:
    if engine.binding_is_input(binding):
        input_shape = engine.get_binding_shape(binding)
        print("binding shape:", input_shape)
        input_size = trt.volume(input_shape) * args.batch_size * np.dtype(np.float32).itemsize
        #input_size = trt.volume(input_shape) * engine.max_batch_size * np.dtype(np.float16).itemsize
        device_input = cuda.mem_alloc(input_size)

    else:
        output_shape = engine.get_binding_shape(binding)
        # Create a page-locked host buffer (i.e. won't be swapped to disk),
        # so the async device-to-host copy below is truly asynchronous
        host_output = cuda.pagelocked_empty(tuple(output_shape), dtype=np.float32)
        device_output = cuda.mem_alloc(host_output.nbytes)

batch_time = AverageMeter()
top1 = AverageMeter()
top5 = AverageMeter()
stream = cuda.Stream()
time_array = []
loop_counter = 0

for i, (input, target) in enumerate(loader):
    print("\n--------------------------\n")

    #tracemalloc.start()
    

    # Build a contiguous float32 host array from the batch tensor
    host_input = np.array(input, dtype=np.float32, order='C')
    
    #stream = cuda.Stream()
    #print("---Stream Created---")
    start_process = time.time()
    cuda.memcpy_htod_async(device_input, host_input, stream)
    #print("---Copied to Device---")
    
    # Inference: bindings lists the device pointers in binding order
    # (input first, then output)
    #print(type(device_input))
    context.execute_async(args.batch_size, bindings=[int(device_input), int(device_output)], stream_handle=stream.handle)
    #print("---Inference Done---")
    cuda.memcpy_dtoh_async(host_output, device_output, stream)
    stream.synchronize()
    end_process = time.time()        
    time_array.append(end_process - start_process)

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:
https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#onnx-export

  1. Validate your model with the below snippet:

check_model.py

import onnx

filename = "yourONNXmodel"  # placeholder: path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command (see the example below).

In case you are still facing issues, request you to share the trtexec "--verbose" log for further debugging.
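
For reference, a basic trtexec invocation looks something like the line below; the file names here are placeholders, not the actual files from this thread.

trtexec --onnx=yourONNXmodel.onnx --saveEngine=model.engine --verbose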
Thanks!

Thank you for the reply.

  1. At first, I created the ONNX file named initial.onnx from this repo (GitHub - facebookresearch/detr: End-to-End Object Detection with Transformers) and tried to convert this ONNX into an engine file using the trtexec command line.

  2. When I tried this, I got a few errors mentioning builtin_op_importers. I had seen the same error in another issue ticket, where they suggested using onnx-simplifier. Using onnx-simplifier, I generated a simplified.onnx file (a sketch of this step follows this list).

  3. Using this simplified.onnx, I was able to create an int8 engine file without any errors using trtexec. The engine was named detr.engine.
    Now I need to validate this detr.engine file and get accuracy values on the COCO dataset.
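
For reference, the simplification step in item 2 was along these lines; this is a minimal sketch assuming onnx-simplifier's Python API (the onnxsim command-line tool works too), with the file names from above:

import onnx
from onnxsim import simplify

model = onnx.load("initial.onnx")
model_simp, check = simplify(model)
assert check, "simplified ONNX model could not be validated"
onnx.save(model_simp, "simplified.onnx")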

I have attached all the files in this Drive link: TensorRT - Google Drive

Also, I have tried check_model.py by passing the ONNX file, and the check passed without any errors or warnings.

Thank you

Hi @nikhil.chowdary,

As you would like to get multiple outputs, hope the following similar thread may help you.
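
In short, the change is to allocate one host/device buffer pair per output binding and to pass all device pointers in the bindings list. Below is a minimal sketch of that change against your script, assuming the same implicit-batch API you already use (binding 0 is the input, bindings 1 and 2 are the two outputs):

# Allocate one device buffer for the input and one host/device pair per output
host_outputs, device_outputs = [], []
for binding in engine:
    if engine.binding_is_input(binding):
        input_shape = engine.get_binding_shape(binding)
        input_size = trt.volume(input_shape) * args.batch_size * np.dtype(np.float32).itemsize
        device_input = cuda.mem_alloc(input_size)
    else:
        output_shape = engine.get_binding_shape(binding)
        host_output = cuda.pagelocked_empty(tuple(output_shape), dtype=np.float32)
        host_outputs.append(host_output)
        device_outputs.append(cuda.mem_alloc(host_output.nbytes))

# ... then, inside the loop over the dataset:
cuda.memcpy_htod_async(device_input, host_input, stream)
# bindings must follow binding order: the input first, then both outputs
bindings = [int(device_input)] + [int(d) for d in device_outputs]
context.execute_async(args.batch_size, bindings=bindings, stream_handle=stream.handle)
# copy each output back into its own host buffer
for host_out, device_out in zip(host_outputs, device_outputs):
    cuda.memcpy_dtoh_async(host_out, device_out, stream)
stream.synchronize()

After stream.synchronize(), host_outputs holds both output arrays (for DETR, typically the class logits and the predicted boxes), which you can then feed into your COCO accuracy computation.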

Thank you.