Run PeopleNet with TensorRT

Hello,

We are trying to run the pruned PeopleNet model with TensorRT 7, without DeepStream.

Where can we find good post-processing code for a DetectNet model?

We have tried some post processing functions:

Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.

Thanks for your answer.

Is the pre-processing similar to Faster R-CNN pre-processing?

Hello,

I have the same problem as the original poster. I am trying to adapt the code of nvdsinfer_custombboxparser.cpp to a simple Python example, but I still cannot parse the output properly.

Does peoplenet need any specific preprocessing? On the model’s page I have just found “Input: Color Images of resolution 960 X 544 X 3”. I have tried both as HWC and CHW, without normalization or with normalization to [0, 1] (but of course, when I am trying things randomly there are tons of options and I might have made an error somewhere).

Is the output actually formatted as (xmin, ymin, xmax, ymax), like it seems to be parsed in nvdsinfer_custombboxparser.cpp, or is it (xc, yc, w, h), as written on the model’s page?

The outputs’ channel order is also not clear to me. The model’s page lists it as 60x34x12, which would be gridH * gridW * c * 4. When I look at nvdsinfer_custombboxparser it seems to be parsed differently.

Is it possible to get a bit more information about how to run the model in TensorRT?
Thanks a lot!

Hi,

I’m now able to get good bounding boxes using the post-processing given by Morganh.
As pre-processing, I use:

  • BGR Images

  • Divide all pixel values by 255

I get the same results as with TLT inference. However, I have to set the confidence threshold to 0.3 in my code, versus 0.8 for TLT inference.

I think I have missed something in the pre-processing.

For detectnet_v2 pre-processing with 3 channels, please refer to the code below.
import numpy as np
a = np.asarray(img).astype(np.float32)
a = a.transpose(2, 0, 1) / 255.0

RGB or BGR?

RGB, but transposed from (H, W, C) to (C, H, W).
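
Putting the answers above together, here is a minimal pre-processing sketch. It assumes the image has already been resized to 960 x 544 and is in RGB order. The function name `preprocess` and the leading batch dimension are assumptions for illustration, not something confirmed in this thread.

```python
import numpy as np

MODEL_H, MODEL_W = 544, 960  # input resolution quoted on the model page


def preprocess(rgb_hwc):
    """Turn an RGB uint8 image of shape (544, 960, 3) into a float32
    tensor of shape (1, 3, 544, 960) with values in [0, 1]."""
    a = np.asarray(rgb_hwc)
    if a.shape != (MODEL_H, MODEL_W, 3):
        raise ValueError(f"expected {(MODEL_H, MODEL_W, 3)}, got {a.shape}")
    # (H, W, C) -> (C, H, W), then scale to [0, 1]
    a = a.astype(np.float32).transpose(2, 0, 1) / 255.0
    # Assumption: the engine expects a batch dimension and a contiguous buffer
    return np.ascontiguousarray(a[np.newaxis])
```

If the image was loaded in BGR order (for example with OpenCV), reverse the channels first with `rgb = bgr[:, :, ::-1]`.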

Hello @m.fiore, I am stuck on post-processing. Could you please share your Python post-processing code? Thanks in advance.

Hi @cogbot here are some snippets of code. Hope this helps.

model_h = 544
model_w = 960
stride = 16
box_norm = 35.0

grid_h = int(model_h / stride)
grid_w = int(model_w / stride)
grid_size = grid_h * grid_w

grid_centers_w = []
grid_centers_h = []

for i in range(grid_h):
    value = (i * stride + 0.5) / box_norm
    grid_centers_h.append(value)

for i in range(grid_w):
    value = (i * stride + 0.5) / box_norm
    grid_centers_w.append(value)


def applyBoxNorm(o1, o2, o3, o4, x, y):
    """
    Applies the GridNet box normalization
    Args:
        o1 (float): first argument of the result
        o2 (float): second argument of the result
        o3 (float): third argument of the result
        o4 (float): fourth argument of the result
        x: column index on the grid
        y: row index on the grid

    Returns:
        float: rescaled first argument
        float: rescaled second argument
        float: rescaled third argument
        float: rescaled fourth argument
    """
    # Uses the module-level grid centers and box_norm defined above
    o1 = (o1 - grid_centers_w[x]) * -box_norm
    o2 = (o2 - grid_centers_h[y]) * -box_norm
    o3 = (o3 + grid_centers_w[x]) * box_norm
    o4 = (o4 + grid_centers_h[y]) * box_norm
    return o1, o2, o3, o4


def postprocess(outputs, min_confidence, analysis_classes):
    """
    Postprocesses the inference output
    Args:
        outputs (list of float): inference output
        min_confidence (float): min confidence to accept detection
        analysis_classes (list of int): indices of the classes to consider

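    Returns:
        list: each element is a pair of (x, y) tuples, the top-left and bottom-right corners of a bounding box
    """
    # outputs[0] is the flattened bbox tensor, outputs[1] the flattened coverage tensor
    num_classes = len(outputs[1]) // grid_size
    bbs = []
    for c in range(num_classes):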
        if c not in analysis_classes:
            continue

        x1_idx = (c * 4 * grid_size)
        y1_idx = x1_idx + grid_size
        x2_idx = y1_idx + grid_size
        y2_idx = x2_idx + grid_size

        boxes = outputs[0]
        for h in range(grid_h):
            for w in range(grid_w):
                i = w + h * grid_w
                if outputs[1][c * grid_size + i] >= min_confidence:
                    o1 = boxes[x1_idx + w + h * grid_w]
                    o2 = boxes[y1_idx + w + h * grid_w]
                    o3 = boxes[x2_idx + w + h * grid_w]
                    o4 = boxes[y2_idx + w + h * grid_w]

                    o1, o2, o3, o4 = applyBoxNorm(
                        o1, o2, o3, o4, w, h)

                    xmin = int(o1)
                    ymin = int(o2)
                    xmax = int(o3)
                    ymax = int(o4)
                    bbs.append([(xmin, ymin), (xmax, ymax)])
    return bbs
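
For anyone working with the unflattened tensors, the same decoding can be sketched in vectorized NumPy. This is only a sketch: it assumes the bbox output has shape (num_classes * 4, grid_h, grid_w) and the coverage output has shape (num_classes, grid_h, grid_w), and it reuses the stride of 16 and box norm of 35.0 from the snippet above. The function name `decode_detectnet` is made up for illustration. No clustering or NMS is applied, so overlapping boxes are returned as-is.

```python
import numpy as np

STRIDE = 16      # same value as in the snippet above
BOX_NORM = 35.0  # same value as in the snippet above


def decode_detectnet(bbox, cov, min_confidence):
    """Decode raw outputs into (class_ids, scores, boxes).

    bbox: array of shape (num_classes * 4, grid_h, grid_w)
    cov:  array of shape (num_classes, grid_h, grid_w)
    boxes has shape (N, 4) as (xmin, ymin, xmax, ymax) in input pixels.
    """
    bbox = np.asarray(bbox, dtype=np.float32)
    cov = np.asarray(cov, dtype=np.float32)
    num_classes, grid_h, grid_w = cov.shape
    bbox = bbox.reshape(num_classes, 4, grid_h, grid_w)

    # Normalized grid-cell centers along x (columns) and y (rows)
    cx = (np.arange(grid_w, dtype=np.float32) * STRIDE + 0.5) / BOX_NORM
    cy = (np.arange(grid_h, dtype=np.float32) * STRIDE + 0.5) / BOX_NORM

    # Same arithmetic as applyBoxNorm, broadcast over the whole grid
    x1 = (bbox[:, 0] - cx[None, None, :]) * -BOX_NORM
    y1 = (bbox[:, 1] - cy[None, :, None]) * -BOX_NORM
    x2 = (bbox[:, 2] + cx[None, None, :]) * BOX_NORM
    y2 = (bbox[:, 3] + cy[None, :, None]) * BOX_NORM

    # Keep only the cells whose coverage passes the threshold
    c, h, w = np.nonzero(cov >= min_confidence)
    boxes = np.stack([x1[c, h, w], y1[c, h, w], x2[c, h, w], y2[c, h, w]], axis=1)
    return c, cov[c, h, w], boxes
```

The returned boxes are floats; cast them with `boxes.astype(int)` to match the integer corners produced by the loop version.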

I am sure it will help. I really appreciate it. Thank you so much, @m.fiore.

Hi. Did you resolve the pre- and post-processing issues? Could you please share a full example of DetectNet inference in Python? Thanks in advance.

@m.fiore @Morganh
Hi,
I am now using PeopleNet, converted to a TensorRT engine (an .engine file), but I am confused by the output of the model. There is a tensor named output_bbox/BiasAdd with shape (12, 34, 60) and a tensor named output_cov/Sigmoid with shape (3, 34, 60). I can’t parse output_bbox/BiasAdd to get the bounding boxes (left, top, right, bottom). Could you please help me and tell me how to resolve it? Thanks.

Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.
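
As a rough illustration of how those two shapes relate to the flat indexing used in the Python snippet earlier in this thread: assuming the engine returns contiguous row-major buffers, channel `c * 4 + k` of the (12, 34, 60) bbox tensor holds coordinate `k` (x1, y1, x2, y2) of class `c`. The helper `flat_index` is hypothetical and only there to make the mapping explicit.

```python
import numpy as np

num_classes, grid_h, grid_w = 3, 34, 60
grid_size = grid_h * grid_w

# Dummy bbox tensor with the shape reported for output_bbox/BiasAdd
bbox = np.arange(num_classes * 4 * grid_size, dtype=np.float32)
bbox = bbox.reshape(num_classes * 4, grid_h, grid_w)
flat = bbox.ravel()  # row-major flattening, as assumed for the engine buffer


def flat_index(c, k, h, w):
    """Flat offset of coordinate k (0..3) of class c at grid cell (h, w)."""
    return (c * 4 + k) * grid_size + h * grid_w + w


# The flat offset and the 3-D index address the same element
c, k, h, w = 2, 3, 10, 25
assert flat[flat_index(c, k, h, w)] == bbox[c * 4 + k, h, w]
```

The coverage tensor of shape (3, 34, 60) follows the same pattern, with offset `c * grid_size + h * grid_w + w`.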

More info can be found in

@Morganh Thanks for the reply. I have read the C++ program but do not fully understand it. Is there a Python program for it? Or is there a way to use the C++ library? I have installed the latest DeepStream SDK on my Jetson Nano and compiled the library, but I cannot find out how to use it. I ran the DeepStream PeopleNet example, but I cannot find the code that parses the output.

There is no official Python version of the post-processing. You can find some useful info above or in the links I mentioned.
For C++ in DeepStream, please run the sample apps directly: GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

Thanks.

@Morganh Is there an article about the PeopleNet model structure?

PeopleNet is actually a detectnet_v2 model.
https://developer.nvidia.com/blog/training-custom-pretrained-models-using-tlt/

Thanks.