Running NVIDIA pretrained models in TensorRT inference

Hello, I successfully converted the LPRNet pretrained model (.etlt) to .trt using the tao-converter tool.
I am trying to run inference on the engine file, but I am getting the same results as shown in this post:

I understand I need to process the output and map it to the correct characters, but how can I get the bounding boxes of the detection? Where can I find the steps to deploy in TensorRT?
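On mapping the output to characters: a minimal sketch, assuming the engine returns one class index per time step (an argmax sequence) and that decoding follows the usual greedy CTC rule of collapsing consecutive repeats and then dropping the blank. The character list and the blank index below are assumptions for illustration only; the real ones come from the characters file the model was trained with.

```python
def ctc_greedy_decode(indices, characters, blank_id=None):
    """Collapse repeated indices and remove blanks (greedy CTC decoding)."""
    if blank_id is None:
        # Assumption: the blank class is the index just past the last character.
        blank_id = len(characters)
    decoded = []
    previous = None
    for idx in indices:
        idx = int(idx)
        # Emit a character only when the index changes and is not the blank.
        if idx != previous and idx != blank_id:
            decoded.append(characters[idx])
        previous = idx
    return "".join(decoded)


# Hypothetical character set: digits followed by upper-case letters.
CHARACTERS = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
BLANK = len(CHARACTERS)

# Made-up argmax sequence, not real model output.
sequence = [BLANK, 10, 10, BLANK, 11, 11, 1, BLANK, 1, 2, 2, BLANK]
print(ctc_greedy_decode(sequence, CHARACTERS))
```

The blank between the two 1 entries is what lets a genuinely repeated character survive the collapse step.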

Hi,

This looks like a TAO Toolkit related issue. We will move this post to the TAO Toolkit forum.

Thanks!

To detect the bounding box of the license plate, please use another model (LPDNet). Refer to the LPDNet page in the TAO Toolkit 3.22.05 documentation.
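The two-stage flow described here (LPDNet finds the plate, LPRNet reads it) implies a cropping step between the two models. A rough sketch of that step, assuming the LPDNet output has already been post-processed into pixel-coordinate boxes (x1, y1, x2, y2); that post-processing is model-specific and not shown. The helper names clamp_box and crop_plates are hypothetical.

```python
def clamp_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) box to the image; return None if it is empty."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


def crop_plates(image, boxes):
    """Return one crop per valid box.

    image is an H x W x C array (for example from cv2.imread).
    """
    height, width = image.shape[:2]
    crops = []
    for box in boxes:
        clamped = clamp_box(box, width, height)
        if clamped is None:
            continue  # skip boxes that fall outside the image or have no area
        x1, y1, x2, y2 = clamped
        crops.append(image[y1:y2, x1:x2])
    return crops
```

Each crop would then go through the LPRNet preprocessing (resize to 96x48, scale, transpose) before inference.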

What is the command to convert the model to TensorRT? For example, for LPRNet I use:

tao-converter <etlt_model> -k <key_to_etlt_model> -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 -e <path_to_generated_trt_engine>

I got this from the TAO Toolkit documentation, but I cannot find the one for LPRNet.

The command you shared above is already for LPRNet.

I meant LPDNet, but I figured it out. It is:

tao-converter <etlt_model> -k <key_to_etlt_model> -p Input,1x3x480x640,4x3x480x640,16x3x480x640 -e <path_to_generated_trt_engine>

Why does the input name after -p change?

It is normal. Different models may have different input names.
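The two commands in this thread differ only in the input tensor name and the shape profile passed to -p. A small sketch that assembles the argument list makes that explicit. It uses only the flags shown in the thread (-k, -p, -e); the function name, file names, and the key placeholder are hypothetical.

```python
def build_tao_converter_args(etlt_model, key, input_name, chw, engine_path,
                             batch_sizes=(1, 4, 16)):
    """Build the tao-converter argument list for a dynamic-batch profile.

    chw is (channels, height, width); batch_sizes is (min, opt, max).
    """
    c, h, w = chw
    shapes = ",".join(f"{b}x{c}x{h}x{w}" for b in batch_sizes)
    return [
        "tao-converter", etlt_model,
        "-k", key,
        "-p", f"{input_name},{shapes}",
        "-e", engine_path,
    ]


# Placeholder file names and key; substitute your own.
lpr = build_tao_converter_args("lprnet.etlt", "<key>", "image_input",
                               (3, 48, 96), "lpr_engine.trt")
lpd = build_tao_converter_args("lpdnet.etlt", "<key>", "Input",
                               (3, 480, 640), "lpd_engine.trt")
print(" ".join(lpr))
print(" ".join(lpd))
```

The resulting list can be handed to subprocess.run(args, check=True) so that a failed conversion raises instead of silently leaving no engine file.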

Where can I find this information for the pretrained models? Also, running the LPDNet.trt model with the same code I used for LPRNet gives me the following error:

context = trt_engine.create_execution_context()
AttributeError: 'NoneType' object has no attribute 'create_execution_context'

Usually you can find the info in the model card. For LPDNet, see https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/lpdnet

  1. If it is the one trained on the detectnet_v2 network, use uff-input-blob-name=input_1
  2. If it is the one trained on the yolov4_tiny network, use uff-input-blob-name=Input

For the official way to run inference with the detectnet_v2 network, see
https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/detectnet_v2.html#using-inference-on-the-model
or
https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/detectnet_v2.html#deploying-to-deepstream

There are also some unofficial topics, for example https://forums.developer.nvidia.com/t/run-peoplenet-with-tensorrt/128000/21

I can't find that info in the model card. For LPRNet it is image_input and for LPDNet it is just input… Can you indicate exactly where that is?

uff-input-blob-name # is not in my code

I am using the following code.

import os
import time

import cv2
#import matplotlib.pyplot as plt
import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
from PIL import Image
import pdb


class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()


def load_engine(trt_runtime, engine_path):
    with open(engine_path, "rb") as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    return engine

# Allocates all buffers required for an engine, i.e. host/device inputs/outputs.
def allocate_buffers(engine, batch_size=-1):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        # pdb.set_trace()
        size = trt.volume(engine.get_binding_shape(binding)) * batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
            print(f"input: shape:{engine.get_binding_shape(binding)} dtype:{engine.get_binding_dtype(binding)}")
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
            print(f"output: shape:{engine.get_binding_shape(binding)} dtype:{engine.get_binding_dtype(binding)}")
    return inputs, outputs, bindings, stream



def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(
        batch_size=batch_size, bindings=bindings, stream_handle=stream.handle
    )
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]


def do_inference_v2(context, bindings, inputs, outputs, stream):
    """do_inference_v2 (for TensorRT 7.0+)

    This function is generalized for multiple inputs/outputs for full
    dimension networks.
    Inputs and outputs are expected to be lists of HostDeviceMem objects.
    """
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

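# Note: CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA is
# initialized; here pycuda.autoinit has already run at import time.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
# TensorRT logger singleton
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)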
trt_engine_path = "lpr_engine.trt"

trt_runtime = trt.Runtime(TRT_LOGGER)
# pdb.set_trace()
trt_engine = load_engine(trt_runtime, trt_engine_path)
# Execution context is needed for inference
context = trt_engine.create_execution_context()
# This allocates memory for network inputs/outputs on both CPU and GPU
inputs, outputs, bindings, stream = allocate_buffers(trt_engine)

# pdb.set_trace()
image = [cv2.imread("car.jpg")]

# image = cv2.resize(image, (96, 48))/255.0

# image = image.T


image = np.array([cv2.resize(img, (96, 48)) / 255.0 for img in image], dtype=np.float32)

image = image.transpose(0, 3, 1, 2)


np.copyto(inputs[0].host, image.ravel())

input_shape = (1,3,48,96)
context.set_binding_shape(0, input_shape)

output = do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
print(output)


As mentioned above, please see the LPDNet model card on NVIDIA NGC.
You can search the page for “uff-input-blob-name”.
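Once an engine has been built, its binding names can also be read back from the engine itself, which is a quick way to confirm the input name. A sketch using the same binding-style calls as the code earlier in this thread (iterating the engine, binding_is_input, get_binding_shape); that API belongs to the TensorRT versions this thread targets and may not exist in newer releases. describe_bindings is a hypothetical helper name.

```python
def describe_bindings(engine):
    """Return (name, kind, shape) for every binding of a deserialized engine."""
    described = []
    for name in engine:  # iterating an engine yields its binding names
        kind = "input" if engine.binding_is_input(name) else "output"
        shape = tuple(engine.get_binding_shape(name))
        described.append((name, kind, shape))
    return described
```

For example, `for name, kind, shape in describe_bindings(trt_engine): print(kind, name, shape)`. With a dynamic-batch engine the batch dimension is reported as -1.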

Got it, thanks.

I am still getting this error:

context = trt_engine.create_execution_context()
AttributeError: 'NoneType' object has no attribute 'create_execution_context'

It works with LPRNet but not with LPDNet.
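The AttributeError means load_engine returned None: deserialize_cuda_engine failed, and the failure only surfaced one line later. A sketch that fails at the right place and registers TensorRT's bundled plugins before deserializing. Whether missing plugins are the cause in this case is an assumption; another typical cause is an engine built with a different TensorRT version or on a different GPU than the one running inference. The TensorRT logger normally prints the actual reason. The names make_runtime and load_engine_checked are hypothetical.

```python
def make_runtime():
    """Create a TensorRT runtime with the bundled plugins registered."""
    import tensorrt as trt  # imported lazily so the helper below stays testable

    logger = trt.Logger(trt.Logger.WARNING)
    # Register the plugins shipped with TensorRT; engines that contain
    # plugin layers cannot be deserialized without this.
    trt.init_libnvinfer_plugins(logger, "")
    return trt.Runtime(logger)


def load_engine_checked(trt_runtime, engine_path):
    """Deserialize an engine and raise immediately if TensorRT returns None."""
    with open(engine_path, "rb") as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    if engine is None:
        raise RuntimeError(
            f"Failed to deserialize {engine_path!r}; check the TensorRT log "
            "for the reason (version or GPU mismatch, missing plugin, ...)."
        )
    return engine
```

Usage would be `trt_engine = load_engine_checked(make_runtime(), "lpd_engine.trt")`, where the engine file name is a placeholder.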

Please refer to above.
