Hello,
We are trying to run the pruned PeopleNet model with TensorRT 7, without DeepStream.
Where can we find good post-processing code for a DetectNet model?
We have tried some post processing functions:
Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.
Thanks for your answer.
Is the pre-processing similar to Faster R-CNN pre-processing?
Hello,
I have the same problem as the original poster. I am trying to adapt the code of nvdsinfer_custombboxparser.cpp to a simple Python example, but I still cannot parse the output properly.
Does PeopleNet need any specific pre-processing? On the model’s page I have only found “Input: Color Images of resolution 960 X 544 X 3”. I have tried both HWC and CHW, without normalization and with normalization to [0, 1] (but of course, when I try things randomly there are tons of options, and I might have made an error somewhere).
Is the output actually formatted as (xmin, ymin, xmax, ymax), as it seems to be parsed in nvdsinfer_custombboxparser.cpp, or is it (xc, yc, w, h), as written on the model’s page?
The output’s channel order is also not clear to me. The model’s page lists it as 60x34x12, which would be gridH * gridW * c * 4. When I look at nvdsinfer_custombboxparser.cpp, it seems to be parsed differently.
Is it possible to get a bit more information about how to run the model in TensorRT?
Thanks a lot!
Hi,
I’m now able to get good bounding boxes; I use the post-processing suggested by Morganh.
And as pre-processing:
BGR images
Divide all pixel values by 255
I get the same result as in TLT inference. However, I have to set the confidence threshold to 0.3 in my code, versus 0.8 for TLT inference.
I think I have missed something in the pre-processing.
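For detectnet_v2 pre-processing with 3 channels, please refer to the lines below.
import numpy as np
# img is the input image, already resized to the model input resolution
a = np.asarray(img).astype(np.float32)
a = a.transpose(2, 0, 1) / 255.0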
RGB or BGR ?
RGB.
But (H, W, C) → (C, H, W)
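Putting the answers above together (RGB order, divide by 255, HWC to CHW), the pre-processing could be sketched roughly as follows. This is only a sketch under assumptions: the function names preprocess and bgr_to_rgb are hypothetical, the image is assumed to be already resized to 960 x 544, and the leading batch dimension is an assumption about how the engine was built.

```python
import numpy as np

MODEL_H, MODEL_W = 544, 960  # input resolution quoted from the model page


def bgr_to_rgb(img_bgr):
    """Reverse the channel order, e.g. for images loaded in BGR order."""
    return img_bgr[..., ::-1]


def preprocess(img_rgb):
    """Hypothetical helper: RGB uint8 image of shape (544, 960, 3) -> float32 tensor.

    Assumes the image has already been resized to the model resolution.
    """
    a = np.asarray(img_rgb).astype(np.float32)
    if a.shape != (MODEL_H, MODEL_W, 3):
        raise ValueError(f"expected {(MODEL_H, MODEL_W, 3)}, got {a.shape}")
    a = a.transpose(2, 0, 1) / 255.0  # (H, W, C) -> (C, H, W), scaled to [0, 1]
    # Assumption: the engine expects a leading batch dimension of 1.
    return np.ascontiguousarray(a[np.newaxis])


# Usage with a synthetic all-red image
img = np.zeros((MODEL_H, MODEL_W, 3), dtype=np.uint8)
img[..., 0] = 255
tensor = preprocess(img)
```

If the image was loaded in BGR order, calling bgr_to_rgb on it first keeps the channel order consistent with the answer above.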
Hello @m.fiore, I am stuck in post-processing. Can you please share your python code of post_processing? Thanks in advance.
Hi @cogbot here are some snippets of code. Hope this helps.
model_h = 544
model_w = 960
stride = 16
box_norm = 35.0
# Assumed PeopleNet label order; output_cov has one channel per class
classes = ["person", "bag", "face"]

grid_h = model_h // stride
grid_w = model_w // stride
grid_size = grid_h * grid_w

# Grid cell centers, already divided by box_norm
grid_centers_h = [(i * stride + 0.5) / box_norm for i in range(grid_h)]
grid_centers_w = [(i * stride + 0.5) / box_norm for i in range(grid_w)]


def applyBoxNorm(o1, o2, o3, o4, x, y):
    """
    Applies the GridNet box normalization
    Args:
        o1 (float): first argument of the result (x1)
        o2 (float): second argument of the result (y1)
        o3 (float): third argument of the result (x2)
        o4 (float): fourth argument of the result (y2)
        x (int): column index on the grid
        y (int): row index on the grid
    Returns:
        float: rescaled first argument
        float: rescaled second argument
        float: rescaled third argument
        float: rescaled fourth argument
    """
    o1 = (o1 - grid_centers_w[x]) * -box_norm
    o2 = (o2 - grid_centers_h[y]) * -box_norm
    o3 = (o3 + grid_centers_w[x]) * box_norm
    o4 = (o4 + grid_centers_h[y]) * box_norm
    return o1, o2, o3, o4


def postprocess(outputs, min_confidence, analysis_classes):
    """
    Postprocesses the inference output
    Args:
        outputs (list): outputs[0] is the flattened bbox tensor,
            outputs[1] is the flattened coverage (confidence) tensor
        min_confidence (float): min confidence to accept detection
        analysis_classes (list of int): indices of the classes to consider
    Returns:
        list: each element is [(xmin, ymin), (xmax, ymax)], the corners of a bb
    """
    bbs = []
    boxes = outputs[0]
    for c in range(len(classes)):
        if c not in analysis_classes:
            continue
        # Each class owns 4 consecutive planes of grid_size values: x1, y1, x2, y2
        x1_idx = c * 4 * grid_size
        y1_idx = x1_idx + grid_size
        x2_idx = y1_idx + grid_size
        y2_idx = x2_idx + grid_size
        for h in range(grid_h):
            for w in range(grid_w):
                i = w + h * grid_w
                if outputs[1][c * grid_size + i] >= min_confidence:
                    o1, o2, o3, o4 = applyBoxNorm(
                        boxes[x1_idx + i], boxes[y1_idx + i],
                        boxes[x2_idx + i], boxes[y2_idx + i], w, h)
                    xmin = int(o1)
                    ymin = int(o2)
                    xmax = int(o3)
                    ymax = int(o4)
                    bbs.append([(xmin, ymin), (xmax, ymax)])
    return bbs
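For reference, the same decoding can also be written without Python loops. The following is a minimal NumPy sketch, not an official implementation: the function name decode_detectnet and its return layout are hypothetical, and it assumes the bbox tensor is laid out as (num_classes * 4, grid_h, grid_w), the coverage tensor as (num_classes, grid_h, grid_w), and the stride 16 / box_norm 35.0 values used in the snippet above.

```python
import numpy as np


def decode_detectnet(bbox, cov, min_confidence, stride=16, box_norm=35.0):
    """Hypothetical vectorized decoder for a detectnet_v2-style grid output.

    bbox: array of shape (num_classes * 4, grid_h, grid_w)
    cov:  array of shape (num_classes, grid_h, grid_w)
    Returns (class_ids, scores, boxes), boxes being (N, 4) as x1, y1, x2, y2.
    """
    cov = np.asarray(cov, dtype=np.float32)
    num_classes, grid_h, grid_w = cov.shape
    # Assumption: per class, the 4 planes are x1, y1, x2, y2 in that order.
    bbox = np.asarray(bbox, dtype=np.float32).reshape(num_classes, 4, grid_h, grid_w)

    # Grid cell centers, divided by box_norm as in the loop-based version
    cx = (np.arange(grid_w, dtype=np.float32) * stride + 0.5) / box_norm
    cy = (np.arange(grid_h, dtype=np.float32) * stride + 0.5) / box_norm
    cx = cx[np.newaxis, np.newaxis, :]  # broadcast over classes and rows
    cy = cy[np.newaxis, :, np.newaxis]  # broadcast over classes and columns

    x1 = (bbox[:, 0] - cx) * -box_norm
    y1 = (bbox[:, 1] - cy) * -box_norm
    x2 = (bbox[:, 2] + cx) * box_norm
    y2 = (bbox[:, 3] + cy) * box_norm

    # Keep every grid cell whose coverage reaches the threshold
    cls, h, w = np.nonzero(cov >= min_confidence)
    boxes = np.stack([x1[cls, h, w], y1[cls, h, w],
                      x2[cls, h, w], y2[cls, h, w]], axis=1)
    return cls, cov[cls, h, w], boxes
```

It returns every grid cell above the threshold, so overlapping candidates for the same object are not merged here.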
Hi. Did you resolve the pre- and post-processing issues? Could you please share a full example of DetectNet inference using Python? Thanks in advance.
@m.fiore @Morganh
Hi,
I am now using PeopleNet and have converted it to a TensorRT engine (an .engine file), but I am confused by the output of the model. There is a tensor named output_bbox/BiasAdd with shape (12, 34, 60) and a tensor named output_cov/Sigmoid with shape (3, 34, 60). I can’t parse output_bbox/BiasAdd to get the bounding box (left, top, right, bottom). Could you please help me and tell me how to resolve it? Thanks.
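As a rough illustration of how those two shapes relate to the flat indexing in the Python snippet posted earlier in this thread: the sketch below assumes that channel c * 4 + k of the bbox tensor holds coordinate k (x1, y1, x2, y2) of class c, which is how that snippet indexes the buffer. The helper name flat_index is hypothetical, and the arrays are stand-ins for the real TensorRT output buffers.

```python
import numpy as np

num_classes, grid_h, grid_w = 3, 34, 60  # from the shapes quoted above

# Stand-in tensors; in practice these come from the TensorRT output buffers.
bbox = np.arange(num_classes * 4 * grid_h * grid_w,
                 dtype=np.float32).reshape(num_classes * 4, grid_h, grid_w)
cov = np.zeros((num_classes, grid_h, grid_w), dtype=np.float32)

# Assumption: the 12 channels are 3 classes x 4 coordinates (x1, y1, x2, y2).
per_class = bbox.reshape(num_classes, 4, grid_h, grid_w)


def flat_index(c, k, h, w):
    """Index into the flattened bbox buffer for class c, coordinate k, cell (h, w)."""
    return (c * 4 + k) * grid_h * grid_w + h * grid_w + w


flat = bbox.ravel()  # the 1-D layout that the loop-based snippet indexes
value_from_tensor = per_class[1, 2, 5, 7]
value_from_flat = flat[flat_index(1, 2, 5, 7)]
```

The values read this way are still normalized offsets relative to the grid cell centers; they only become pixel coordinates after the box normalization step from the earlier snippet.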
Please refer to the post-processing code, which is exposed in C++ in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp.
More info can be found in
@Morganh Thanks for the reply. I have read the C++ program but do not fully understand it. Is there a Python version of it? Or is there a way to use the C++ library? I have installed the latest DeepStream SDK on my Jetson Nano and compiled the library, but I cannot find out how to use it. I ran the DeepStream PeopleNet example, but I cannot find the code that parses the output.
There is no official Python version of the post-processing. You can find some useful info above or in the links I mentioned.
For C++ in DeepStream, please run the sample apps directly: GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream
Thanks.
PeopleNet is actually a detectnet_v2 model.
https://developer.nvidia.com/blog/training-custom-pretrained-models-using-tlt/