Run PeopleNet with TensorRT

steventel · June 16, 2020, 6:53am

Hello,

We are trying to run the pruned PeopleNet model with TensorRT 7, without DeepStream.

Where can we find good post-processing code for a DetectNet model?

We have tried these post-processing functions:

dusty-nv/jetson-inference/blob/master/c/detectNet.cpp



AastaNV/DeepStream/blob/master/parser_detectnet/nvparsebbox.cpp

#include "nvparsebbox.h"
#include <iostream>

/* detectnet */
void parse_bbox_custom_detectnet(DimsCHW outputDims, DimsCHW outputDimsBBOX,
    vector<cv::Rect> *rectList, int class_num, int batch_th, int net_width, int net_height,
    float *output_cov_buf,float *output_bbox_buf, float *classthreshold)
{


Morganh · June 16, 2020, 7:03am

Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.

steventel · June 16, 2020, 7:44am

Thanks for your answer.

Is the pre-processing similar to Faster R-CNN pre-processing?

m.fiore · June 16, 2020, 2:59pm

Hello,

I have the same problem as the original poster. I am trying to adapt the code of nvdsinfer_custombboxparser.cpp to a simple Python example, but I still cannot parse the output properly.

Does PeopleNet need any specific preprocessing? On the model’s page I have only found “Input: Color Images of resolution 960 X 544 X 3”. I have tried both HWC and CHW layouts, without normalization and with normalization to [0, 1] (but of course, when trying things randomly there are many options and I might have made an error somewhere).

Is the output actually formatted as (xmin, ymin, xmax, ymax), like it seems to be parsed in nvdsinfer_custombboxparser.cpp, or is it (xc, yc, w, h), as written on the model’s page?

The outputs’ channel order is also not clear to me. The model’s page lists it as 60x34x12, which would be gridH * gridW * c * 4. When I look at nvdsinfer_custombboxparser it seems to be parsed differently.

Is it possible to get some more information about how to run the model in TensorRT?
Thanks a lot!

steventel · June 16, 2020, 3:06pm

Hi,

I’m now able to get good bounding boxes using the post-processing code that Morganh pointed to.
As pre-processing I use:

BGR images
Divide all pixel values by 255

I get the same results as with TLT inference. However, I have to set the confidence threshold to 0.3 in my code, compared with 0.8 for TLT inference.

I think I have missed something in the pre-processing.

Morganh · June 16, 2020, 4:29pm

For detectnet_v2 pre-processing with 3 channels, please refer to the following:
import numpy as np
a = np.asarray(img).astype(np.float32)  # img: H x W x 3 image
a = a.transpose(2, 0, 1) / 255.0  # (H, W, C) -> (C, H, W), scaled to [0, 1]

LoveNvidia · June 22, 2020, 1:07pm

RGB or BGR?

Morganh · June 22, 2020, 2:23pm

RGB.
But transpose (H, W, C) → (C, H, W).
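Putting these replies together, a minimal preprocessing sketch might look like the following. It assumes the frame has already been resized to 960 x 544 and arrives as an H x W x 3 uint8 array in BGR order (as OpenCV would typically deliver it). The function name `preprocess_bgr_frame` and the added batch dimension are illustrative assumptions, not part of any NVIDIA API.

```python
import numpy as np

def preprocess_bgr_frame(frame_bgr):
    """Convert an HxWx3 uint8 BGR frame to a 1x3xHxW float32 RGB tensor in [0, 1]."""
    rgb = frame_bgr[:, :, ::-1]                      # BGR -> RGB (reverse the channel axis)
    chw = rgb.astype(np.float32).transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)
    chw /= 255.0                                     # scale pixel values to [0, 1]
    return np.ascontiguousarray(chw[np.newaxis])     # add a batch axis, make memory contiguous

# Dummy frame at the model resolution (544 rows x 960 columns).
frame = np.zeros((544, 960, 3), dtype=np.uint8)
frame[..., 0] = 255  # fill the blue channel (index 0 in BGR)
tensor = preprocess_bgr_frame(frame)
```

The contiguous copy at the end matters mostly when the array is handed to a device buffer copy, which generally expects a flat, densely packed layout.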

cogbot · October 5, 2020, 12:57am

Hello @m.fiore, I am stuck on post-processing. Could you please share your Python post-processing code? Thanks in advance.

m.fiore · October 5, 2020, 7:48am

Hi @cogbot here are some snippets of code. Hope this helps.

model_h = 544
model_w = 960
stride = 16
box_norm = 35.0

grid_h = int(model_h / stride)
grid_w = int(model_w / stride)
grid_size = grid_h * grid_w

grid_centers_w = []
grid_centers_h = []

for i in range(grid_h):
    value = (i * stride + 0.5) / box_norm
    grid_centers_h.append(value)

for i in range(grid_w):
    value = (i * stride + 0.5) / box_norm
    grid_centers_w.append(value)


def applyBoxNorm(o1, o2, o3, o4, x, y):
    """
    Applies the GridNet box normalization
    Args:
        o1 (float): raw x1 output for the cell
        o2 (float): raw y1 output for the cell
        o3 (float): raw x2 output for the cell
        o4 (float): raw y2 output for the cell
        x (int): column index on the grid
        y (int): row index on the grid

    Returns:
        float: xmin in pixels
        float: ymin in pixels
        float: xmax in pixels
        float: ymax in pixels
    """
    # Uses the module-level grid_centers_w, grid_centers_h and box_norm defined above
    o1 = (o1 - grid_centers_w[x]) * -box_norm
    o2 = (o2 - grid_centers_h[y]) * -box_norm
    o3 = (o3 + grid_centers_w[x]) * box_norm
    o4 = (o4 + grid_centers_h[y]) * box_norm
    return o1, o2, o3, o4


def postprocess(outputs, min_confidence, analysis_classes):
    """
    Postprocesses the inference output
    Args:
        outputs (list): outputs[0] is the flattened bbox tensor, outputs[1] the flattened coverage tensor
        min_confidence (float): min confidence to accept detection
        analysis_classes (list of int): indices of the classes to consider

    Returns:
        list: each element is [(xmin, ymin), (xmax, ymax)], the corners of a bounding box
    """

    bbs = []
    # One coverage grid per class, so the class count follows from the tensor length
    num_classes = len(outputs[1]) // grid_size
    for c in range(num_classes):
        if c not in analysis_classes:
            continue

        x1_idx = (c * 4 * grid_size)
        y1_idx = x1_idx + grid_size
        x2_idx = y1_idx + grid_size
        y2_idx = x2_idx + grid_size

        boxes = outputs[0]
        for h in range(grid_h):
            for w in range(grid_w):
                i = w + h * grid_w
                if outputs[1][c * grid_size + i] >= min_confidence:
                    o1 = boxes[x1_idx + w + h * grid_w]
                    o2 = boxes[y1_idx + w + h * grid_w]
                    o3 = boxes[x2_idx + w + h * grid_w]
                    o4 = boxes[y2_idx + w + h * grid_w]

                    o1, o2, o3, o4 = applyBoxNorm(
                        o1, o2, o3, o4, w, h)

                    xmin = int(o1)
                    ymin = int(o2)
                    xmax = int(o3)
                    ymax = int(o4)
                    bbs.append([(xmin, ymin), (xmax, ymax)])
    return bbs
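
Because neighbouring grid cells often fire on the same object, a per-cell decoder like the one above usually returns many overlapping boxes. DeepStream applies a clustering step after parsing. The sketch below substitutes a plain greedy non-maximum suppression instead, which is a simpler, different technique and is meant only to illustrate the idea. The function names, the `(box, score)` input format and the 0.5 IoU threshold are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of ((xmin, ymin, xmax, ymax), score) pairs."""
    kept = []
    # Visit candidates from highest to lowest score and keep a box only if it
    # does not overlap too much with any box that was already kept.
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_threshold for k in kept):
            kept.append((box, score))
    return kept

candidates = [
    ((100, 100, 200, 300), 0.9),
    ((105, 98, 205, 295), 0.6),   # overlaps heavily with the first box
    ((400, 120, 480, 310), 0.7),  # separate object
]
merged = nms(candidates)
```

To use this with the snippet above, the confidence value would have to be stored alongside each box, and the suppression would normally be run per class.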

cogbot · October 5, 2020, 8:22am

I am sure it will help. I really appreciate it. Thank you so much, @m.fiore.

bumatov · November 23, 2020, 2:30pm

Hi. Did you resolve the pre- and post-processing issues? Could you please share a full example of DetectNet inference using Python? Thanks in advance.

patterson163 · December 7, 2020, 7:48am

@m.fiore @Morganh
Hi,
I am now using PeopleNet and have converted it to a TensorRT engine (an .engine file), but I am confused by the model’s output. The tensor output_bbox/BiasAdd has shape (12, 34, 60) and the tensor output_cov/Sigmoid has shape (3, 34, 60). I can’t parse output_bbox/BiasAdd to get the bounding boxes (left, top, right, bottom). Could you please help me and tell me how to resolve this? Thanks.

Morganh · December 7, 2020, 8:29am

Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.

More info can be found in

patterson163 · December 7, 2020, 8:38am

@Morganh Thanks for the reply. I have read the C++ program but do not fully understand it. Is there a Python program for it? Or is there a way to use the C++ library? I have installed the latest DeepStream SDK on my Jetson Nano and compiled the library, but I cannot find out how to use it. I ran the DeepStream PeopleNet example, but I cannot find the code that parses the output.

Morganh · December 7, 2020, 9:03am

There is no official Python version of the post-processing. You can find some useful info above or in the links I mentioned.
For C++ in DeepStream, please run the sample apps directly: NVIDIA-AI-IOT/deepstream_tao_apps on GitHub (sample apps to demonstrate how to deploy models trained with TAO on DeepStream).
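
For the tensor shapes quoted earlier in the thread, output_bbox/BiasAdd with shape (12, 34, 60) and output_cov/Sigmoid with shape (3, 34, 60), an unofficial NumPy sketch of the decoding could look like this. It assumes the channel layout used in m.fiore's snippet (four consecutive bbox channels per class, in x1, y1, x2, y2 order) and reuses the stride of 16 and box norm of 35.0 from that snippet. The function name `decode_detectnet` and the returned tuple format are hypothetical.

```python
import numpy as np

STRIDE = 16       # grid cell size in pixels (from the snippet earlier in the thread)
BOX_NORM = 35.0   # bbox normalisation constant (from the same snippet)

def decode_detectnet(bbox, cov, min_confidence):
    """Decode bbox (4*C, H, W) and cov (C, H, W) into (class, score, x1, y1, x2, y2) tuples."""
    num_classes, grid_h, grid_w = cov.shape
    bbox = bbox.reshape(num_classes, 4, grid_h, grid_w)
    # Normalised grid-cell centres, shaped to broadcast over columns (x) and rows (y).
    cx = ((np.arange(grid_w) * STRIDE + 0.5) / BOX_NORM)[np.newaxis, :]
    cy = ((np.arange(grid_h) * STRIDE + 0.5) / BOX_NORM)[:, np.newaxis]
    results = []
    for c in range(num_classes):
        x1 = (bbox[c, 0] - cx) * -BOX_NORM
        y1 = (bbox[c, 1] - cy) * -BOX_NORM
        x2 = (bbox[c, 2] + cx) * BOX_NORM
        y2 = (bbox[c, 3] + cy) * BOX_NORM
        # Keep only the cells whose coverage reaches the confidence threshold.
        rows, cols = np.nonzero(cov[c] >= min_confidence)
        for r, q in zip(rows, cols):
            results.append((c, float(cov[c, r, q]),
                            float(x1[r, q]), float(y1[r, q]),
                            float(x2[r, q]), float(y2[r, q])))
    return results

# Synthetic tensors with the shapes quoted in the thread.
bbox = np.zeros((12, 34, 60), dtype=np.float32)
cov = np.zeros((3, 34, 60), dtype=np.float32)
cov[0, 10, 20] = 0.9  # one confident class-0 cell at row 10, column 20
detections = decode_detectnet(bbox, cov, min_confidence=0.5)
```

The boxes come out in network-input pixel coordinates (960 x 544 here), so they still need rescaling to the original image size and some form of clustering or suppression of overlapping candidates.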

patterson163 · December 7, 2020, 9:08am

Thanks.

patterson163 · December 7, 2020, 9:11am

@Morganh Is there an article about the PeopleNet Model structure?

Morganh · December 7, 2020, 9:12am

PeopleNet is actually a detectnet_v2 model.
https://developer.nvidia.com/blog/training-custom-pretrained-models-using-tlt/

patterson163 · December 7, 2020, 9:17am

Thanks.