Inference on LPDNet ONNX file

Hi, I am currently trying to use LPDNet in my environment. I have downloaded the LPDNet_usa_pruned_tao5.onnx file and understand that it is based on detectnet_v2.

However, I cannot figure out how to run inference with onnxruntime in Python, and specifically how to process the outputs.

I tried following the Python script from this topic: Run PeopleNet with tensorrt - #21 by carlos.alvarez

But it seems the TRT engine returns one continuous array, which the ONNX model does not.

This is my preprocessing function:

image = Image.fromarray(np.uint8(arr))
image_resized = image.resize(size=(self.model_w, self.model_h), resample=Image.BILINEAR)
img_np = np.array(image_resized, dtype=np.float32)
# HWC -> CHW
img_np = img_np.transpose((2, 0, 1))
# Normalize to [0.0, 1.0] interval (expected by model)
img_np = (1.0 / 255.0) * img_np
# Add the batch dimension
img_np = np.expand_dims(img_np, axis=0)
return img_np

I figured out that the model input is (1, 3, 480, 640). In addition, the arr variable is an RGB image (read through cv2 and converted from BGR to RGB).

I run inference using:

input_dict = {}
input_dict[] = data
outs =, input_dict)
return outs

( = ‘inputs_1:0’ and self.outputs = None)

The outs I get is a list consisting of:
[np.ndarray(shape=(1,1,30,40)), np.ndarray(shape=(1,4,30,40))]

I’m guessing the first array is the confidences and the second is the boxes. But how do I postprocess this into actual results? In addition, the maximum confidence I get is 0.000146…

Thanks a lot.

You can refer to the tao-tf1 branch in tao_tensorflow1_backend/nvidia_tao_tf1/cv/detectnet_v2/scripts/ at main · NVIDIA/tao_tensorflow1_backend · GitHub or the tao-deploy branch in tao_deploy/nvidia_tao_deploy/cv/detectnet_v2/scripts/ at main · NVIDIA/tao_deploy · GitHub.

I don’t really understand how to fit this part to what is happening with the ONNX inference outputs. What would be in self.target_classes? Isn’t there only one class? Also, it seems input_cluster is a dict. What would I find in this dict, and how do I use it with the output shapes that inference produces for me?

for classes in self.target_classes:
    input_cluster[classes]['bbox'] = self.abs_bbox_converter(input_cluster[classes]['bbox'])
    # Stack predictions
    for keys in list(input_cluster[classes].keys()):
        if 'bbox' in keys:
            input_cluster[classes][keys] = \
                input_cluster[classes][keys][np.newaxis, :, :, :, :]
            input_cluster[classes][keys] = \
                np.asarray(input_cluster[classes][keys]).transpose((1, 2, 3, 4, 0))
        elif 'cov' in keys:
            input_cluster[classes][keys] = input_cluster[classes][keys][np.newaxis,
                                                                        :, :, :]
            input_cluster[classes][keys] = \
                np.asarray(input_cluster[classes][keys]).transpose((2, 1, 3, 4, 0))

return input_cluster

If you train only one class (lpd), then yes, you only need to set lpd in the training spec file.

I suggest checking DetectNet_v2 - NVIDIA Docs for more details.

DetectNet_v2 generates 2 tensors, cov and bbox. The image is divided into 16x16 grid cells. The cov tensor (short for “coverage” tensor) defines the number of grid cells that are covered by an object. The bbox tensor defines the normalized image coordinates of the object top left (x1, y1) and bottom right (x2, y2) with respect to the grid cell. For best results, you can assume the coverage area to be an ellipse within the bbox label with the maximum confidence assigned to the cells in the center and reducing coverage outwards. Each class has its own coverage and bbox tensor, thus the shape of the tensors are as follows:

cov: Batch_size, Num_classes, image_height/16, image_width/16

bbox: Batch_size, Num_classes * 4, image_height/16, image_width/16 (where 4 is the number of coordinates per cell)
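As a quick sanity check, the shape rules above can be written as a small helper. This is only a sketch: the function name `detectnet_v2_output_shapes` is hypothetical, and it assumes the fixed stride of 16 described in the quoted documentation. For a 480x640 input and a single class it reproduces the (1, 1, 30, 40) and (1, 4, 30, 40) arrays reported earlier in this thread.

```python
def detectnet_v2_output_shapes(batch_size, num_classes, image_height, image_width, stride=16):
    """Return the expected (cov, bbox) tensor shapes for a given input size.

    Hypothetical helper: it only applies the shape rules quoted above.
    """
    grid_h = image_height // stride
    grid_w = image_width // stride
    cov_shape = (batch_size, num_classes, grid_h, grid_w)
    # Four coordinates per class and per grid cell.
    bbox_shape = (batch_size, num_classes * 4, grid_h, grid_w)
    return cov_shape, bbox_shape


cov_shape, bbox_shape = detectnet_v2_output_shapes(1, 1, 480, 640)
print(cov_shape)   # (1, 1, 30, 40)
print(bbox_shape)  # (1, 4, 30, 40)
```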

DetectNet_v2 - NVIDIA Docs.

The post-processor module generates renderable bounding boxes from the raw detection output. The process includes the following:

Filtering out valid detections by thresholding objects using the confidence value in the coverage tensor.

Clustering the raw filtered predictions using DBSCAN to produce the final rendered bounding boxes.

Filtering out weaker clusters based on the final confidence threshold derived from the candidate boxes that get grouped into a cluster.
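The first step, together with the conversion of raw bbox values into pixel coordinates, might look roughly like the sketch below for one image and one class. It is not the TAO implementation. The function name `decode_detections` is hypothetical, the arrays are assumed to be in (grid_h, grid_w) layout, and the defaults stride=16, offset=0.5 and bbox_norm=35 are taken from the postprocessor class posted later in this thread. The decoding convention (each cell predicts distances from its own position to the four box edges, scaled by bbox_norm) is an assumption that should be checked against the linked postprocessing code.

```python
import numpy as np


def decode_detections(cov, bbox, stride=16, offset=0.5, bbox_norm=35.0, cov_threshold=0.5):
    """Turn raw (cov, bbox) grids for one image and one class into candidate boxes.

    cov:  (grid_h, grid_w) coverage values.
    bbox: (4, grid_h, grid_w) raw outputs, assumed to be the normalized
          distances from the cell position to the left, top, right and bottom edges.
    Returns an (N, 4) array of [x1, y1, x2, y2] in pixels and an (N,) score array.
    """
    grid_h, grid_w = cov.shape
    # Pixel position associated with each grid cell (assumed convention).
    cx = (np.arange(grid_w) * stride + offset)[np.newaxis, :]
    cy = (np.arange(grid_h) * stride + offset)[:, np.newaxis]
    x1 = cx - bbox[0] * bbox_norm
    y1 = cy - bbox[1] * bbox_norm
    x2 = cx + bbox[2] * bbox_norm
    y2 = cy + bbox[3] * bbox_norm
    # Keep only the cells whose coverage passes the threshold.
    keep = cov >= cov_threshold
    boxes = np.stack([x1[keep], y1[keep], x2[keep], y2[keep]], axis=1)
    return boxes, cov[keep]


# Synthetic example: a single confident cell at row 10, column 20.
cov = np.zeros((30, 40))
bbox = np.zeros((4, 30, 40))
cov[10, 20] = 0.9
bbox[:, 10, 20] = [1.0, 0.5, 1.0, 0.5]
boxes, scores = decode_detections(cov, bbox)
print(boxes, scores)
```

The candidate boxes returned here would still need the clustering and final confidence filtering described in the remaining two steps.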

You can also take a look at tao-toolkit-triton-apps/tao_triton/python/postprocessing/ at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub.

Thanks so much for the reply.

I understand now what I need to run in order to postprocess. But I’m getting stuck with the postprocessor_config.proto file…

What is supposed to be written here? I tried using this file: tao-toolkit-triton-apps/tao_triton/python/proto/postprocessor_config.proto at 5bb2ddbe0f8ef13fe534ab495a626d27e0ee7d03 · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub

and it doesn’t seem to work…

Also, what do you mean by the training spec file mentioned here?

Thanks again.

I suggest running the detectnet_v2 notebook first to get familiar with the default process we shared. See the notebook and training spec files in tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2 at main · NVIDIA/tao_tutorials · GitHub.

You can generate a TensorRT engine and also debug inside the docker container to check how the postprocessing works.
$ docker run --runtime=nvidia -it --rm /bin/bash

Inside the container, /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/postprocessor/ is the same as
tao_deploy/nvidia_tao_deploy/cv/detectnet_v2/ at main · NVIDIA/tao_deploy · GitHub.

Hey, thanks for the reply.
Since my last message I’ve managed to run the model (ONNX) and get results! But I had to do a couple of things. One of them was to skip the DBSCAN part, as it asks for a dbscan_min_samples parameter >1, whereas your proto file says it has to be between 0 and 1. When I tried changing it to 1, it didn’t produce any candidates…

The next thing I did was lower the confidence thresholds to about 0.002 in order to get 2 detections… Do you have any idea why my confidences are so low?

In the end, from a picture with 3 cars, I got 2 bounding boxes on only one of the license plates, and neither is quite tight. This makes me believe that DBSCAN is used as some sort of NMS, which I’m currently skipping.
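For illustration, merging overlapping candidate boxes without DBSCAN could be sketched with a plain greedy NMS, as below. This is a swapped-in technique, not the weighted DBSCAN clustering that the TAO postprocessor uses. The function names `iou_one_to_many` and `greedy_nms` and the 0.5 IoU threshold are hypothetical choices made for this example.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one [x1, y1, x2, y2] box and an (N, 4) array of boxes."""
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    # Small epsilon avoids division by zero for degenerate boxes.
    return inter / (area + areas - inter + 1e-9)


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept by greedy non-maximum suppression."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        ious = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[ious < iou_threshold]
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores))  # [0, 2]
```

Unlike the clustering step, NMS keeps the single highest-scoring box of each overlapping group instead of averaging the group, so the resulting boxes may be less tight than a coverage-weighted mean.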

This is my postprocessor class. I’ve made some changes to the original code you linked to:

class LPDNetPostprocessor(object):
    """Post processor for LPDNet ONNX outputs."""

    def __init__(self, batch_size, frames,
                 output_path, data_format, classes, target_shape):
        """Initialize the LPDNet post processor.

        Args:
            batch_size (int): Number of images in the batch.
            frames (list): List of images.
            output_path (str): Unix path to the output rendered images and labels.
            data_format (str): Order of the input model dimensions.
                "channels_first": CHW order.
                "channels_last": HWC order.
            classes (list): List of the class names.
            target_shape (tuple): Shape of the model input.
        """
        # self.pproc_config = load_clustering_config(postprocessing_config)
        self.classes = classes
        self.output_names = ["output_cov/Sigmoid:0",
                             "output_bbox/BiasAdd:0"]
        self.bbox_norm = [35., 35.]
        self.offset = 0.5
        self.scale_h = 1
        self.scale_w = 1
        self.target_shape = target_shape
        self.stride = 16
        self.linewidth = 4
        # super().__init__(batch_size, frames, output_path, data_format)
        self.batch_size = batch_size
        self.frames = frames
        self.output_path = output_path
        self.data_format = data_format
        if not os.path.exists(self.output_path):
            os.makedirs(self.output_path)
        self.initialized = True
        # Format the dbscan elements into classwise configurations for rendering.
        self.configure()

    def configure(self):
        """Configure the post processor object."""
        self.dbscan_elements = {}
        self.coverage_thresholds = {}
        self.box_color = {}
        self.classwise_clustering_config = {
            "LicensePlate": {
                'coverage_threshold': 0.002,
                'minimum_bounding_box_height': 4,
                'dbscan_config': {
                    'dbscan_eps': 0.3,
                    'dbscan_min_samples': 1,
                    'dbscan_confidence_threshold': 0.002
                },
                'bbox_color': {
                    'R': 0,
                    'G': 255,
                    'B': 0
                }
            }
        }
        for class_name in self.classes:
            if class_name not in self.classwise_clustering_config.keys():
                raise KeyError("Cannot find class name {} in {}".format(
                    class_name, self.classwise_clustering_config.keys()
                ))
            cw_config = self.classwise_clustering_config[class_name]
            self.dbscan_elements[class_name] = dbscan(
                eps=cw_config['dbscan_config']['dbscan_eps'],
                min_samples=cw_config['dbscan_config']['dbscan_min_samples'],
            )
            self.coverage_thresholds[class_name] = cw_config['coverage_threshold']
            self.box_color[class_name] = cw_config['bbox_color']

    def apply(self, results, this_id, render=True):
        """Apply the post processing to the outputs tensors.

        This function takes the raw output tensors from the detectnet_v2 model
        and performs the following steps:

        1. Denormalize the output bbox coordinates
        2. Threshold the coverage output to get the valid indices for the bboxes.
        3. Filter out the bboxes from the "output_bbox/BiasAdd" blob.
        4. Cluster the filtered boxes using DBSCAN.
        5. Render the outputs on images and save them to the output_path/images
        6. Serialize the output bboxes to KITTI Format label files in output_path/labels.
        """
        output_array = {}
        this_id = int(this_id)
        for i, output_name in enumerate(self.output_names):
            output_array[output_name] = results[i].transpose(0, 1, 3, 2)
        assert len(self.classes) == output_array["output_cov/Sigmoid:0"].shape[1], (
            "Number of classes {} != number of dimensions in the output_cov/Sigmoid: {}".format(
                len(self.classes), output_array["output_cov/Sigmoid:0"].shape[1]
            )
        )
        abs_bbox = denormalize_bounding_bboxes(
            output_array["output_bbox/BiasAdd:0"], self.stride,
            self.offset, self.bbox_norm, len(self.classes), self.scale_w,
            self.scale_h, self.data_format, self.target_shape, self.frames,
            this_id - 1
        )
        valid_indices = thresholded_indices(
            output_array["output_cov/Sigmoid:0"], len(self.classes),
            self.classes,
            self.coverage_thresholds
        )
        batchwise_boxes = []
        for image_idx, indices in enumerate(valid_indices):
            covs = output_array["output_cov/Sigmoid:0"][image_idx, :, :, :]
            bboxes = abs_bbox[image_idx, :, :, :]
            imagewise_boxes = []
            for class_idx in range(len(self.classes)):
                clustered_boxes = []
                cw_config = self.classwise_clustering_config[
                    self.classes[class_idx]
                ]
                classwise_covs = covs[class_idx, :, :].flatten()
                classwise_covs = classwise_covs[indices[class_idx]]
                if classwise_covs.size == 0:
                    continue
                classwise_bboxes = bboxes[4*class_idx:4*class_idx+4, :, :]
                # Flatten to (4, N), then keep the thresholded cells as (N, 4) rows.
                classwise_bboxes = classwise_bboxes.reshape(
                    classwise_bboxes.shape[:1] + (-1,)
                ).T[indices[class_idx]]
                pairwise_dist = \
                    1.0 * (1.0 - iou_vectorized(classwise_bboxes))
                # labeling = self.dbscan_elements[self.classes[class_idx]].fit_predict(
                #     X=pairwise_dist,
                #     sample_weight=classwise_covs
                # )
                # DBSCAN is skipped here: every candidate box becomes its own cluster.
                labeling = np.arange(len(classwise_covs))
                labels = np.unique(labeling[labeling >= 0])
                for label in labels:
                    w = classwise_covs[labeling == label]
                    aggregated_w = np.sum(w)
                    w_norm = w / aggregated_w
                    n = len(w)
                    w_max = np.max(w)
                    w_min = np.min(w)
                    b = classwise_bboxes[labeling == label]
                    mean_bbox = np.sum((b.T*w_norm).T, axis=0)

                    # Compute coefficient of variation of the box coords
                    mean_box_w = mean_bbox[2] - mean_bbox[0]
                    mean_box_h = mean_bbox[3] - mean_bbox[1]
                    bbox_area = mean_box_w * mean_box_h
                    valid_box = aggregated_w > cw_config['dbscan_config']['dbscan_confidence_threshold'] \
                        and mean_box_h > cw_config['minimum_bounding_box_height']
                    if valid_box:
                                self.classes[class_idx], 0, 0, 0,
                                mean_bbox, 0, 0, 0, 0,
                                0, 0, 0, confidence_score=aggregated_w

        if render:
            processes = []
            with pool_context(self.batch_size) as pool:
                for image_idx in range(self.batch_size):
                    current_idx = (this_id - 1) * self.batch_size + image_idx
                    if current_idx >= len(self.frames):
                    current_frame = self.frames[current_idx]
                    filename = os.path.basename('tmp.png')
                    output_label_file = os.path.join(
                        self.output_path, "infer_labels",
                    output_image_file = os.path.join(
                        self.output_path, "infer_images",
                    if not os.path.exists(os.path.dirname(output_label_file)):
                    if not os.path.exists(os.path.dirname(output_image_file)):
                            write_kitti_annotation, (output_label_file, batchwise_boxes[image_idx])
                            (current_frame, batchwise_boxes[image_idx],
                             output_image_file, self.box_color,
                for p in processes:

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Could you re-export the onnx by adding --onnx_route tf2onnx in the command?
