DeepStream net-scale-factor on SGIE

Hello,

I have a pipeline with the following layout:

pgie -> tracker -> sgie -> nvdsosd -> nvvideoconvert

I attached a probe function to the sink pad of nvvideoconvert. As I understand it, the SGIE performs inference on the objects detected by the PGIE.
Each object crop is resized to match the input size of the SGIE.

In this setting, the PGIE is an object detector.
The SGIE is a non-object detector (classifier, regression, etc.)

My questions are as follows:

  1. What kind of resizing technique / preprocessing is performed when passing the cropped images made from PGIE inference? If there are options, is there a way to control how the image is preprocessed / resized?

  2. When the SGIE receives an object as input, are the objects from the original un-preprocessed image buffers or are they image frames that undergo the y = net-scale-factor * (x - mean) specified in the PGIE config?

The reason I am asking this is because my SGIE is outputting very erroneous outputs. FYI, I am examining the tensor values directly via the probe function by setting output-tensor-meta=1 inside of the sgie config file.

  3. Is there a recommended way to visualize the SGIE input to ensure that the SGIE is inferencing on the correct object / image?

Environment

DeepStream Version: 5.1
TensorRT Version: 7.2.2.1
NVIDIA Driver Version: 460.32.03
CUDA Version: 11.1

Thank you!

Hi,

This looks DeepStream related, so I am moving it to the DeepStream forum to get better help.

Thank you.

Ah, I apologize for the inconvenience. I meant to post this on the DeepStream forum. Thank you for moving the post.

Hi,
Can you try running your model with the trtexec command and share the "--verbose" log if the issue persists?
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec

You can refer to the link below for the list of supported operators. If any operator is not supported, you need to create a custom plugin to support that operation.

Also, please share your model and script, if not shared already, so that we can help you better.

Meanwhile, for some common errors and queries, please refer to the links below:
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/#error-messaging
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/#faq

Thanks!

Hi,

Thank you for the response. During the TensorRT engine generation, I did not run into any errors or issues regarding unsupported operations.

I am currently running DeepStream on the following docker image:

nvcr.io/nvidia/deepstream:5.1-21.02-devel

I will report back after running trtexec and obtaining the results.

In the meantime, I was wondering if I could get some answers to the following questions:

  1. What kind of resizing technique / preprocessing is performed when passing the cropped images made from the objects detected by the PGIE? If there are options, is there a way to control how the image is preprocessed / resized?
  2. When the SGIE receives an object as input, are the objects from the original un-preprocessed image buffers or are they image frames that undergo the y = net-scale-factor * (x - mean) specified in the PGIE config?

The reason I am asking this is because my SGIE is outputting very erroneous outputs. FYI, I am examining the tensor values directly via the probe function by setting output-tensor-meta=1 inside of the sgie config file.

  3. Is there a recommended way to visualize the SGIE input to ensure that the SGIE is inferencing on the correct object / image?

Thank you for your time!

I have attached the model serialization script result with --verbose specified. The following command was used:

./trtexec --onnx=/root/attribute.onnx --batch=1 --saveEngine=output.trt --verbose

The log file is attached: trt_engine_generation.txt (1.3 MB)

The DS pipeline should have nvstreammux before the PGIE, as below:

nvstreammux -> pgie -> tracker -> sgie -> nvdsosd -> nvvideoconvert

The input buffers are cached in nvstreammux, by default 4 frames per stream.
The input of the SGIE is cropped from the buffer cached in nvstreammux, using the BBOX generated by the PGIE.
There is no option to configure the resize method for now.
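
To make the crop step concrete, here is a minimal pure-Python sketch of "crop by BBOX, then resize to the network input size". It is only an illustration: `crop_and_resize` is a hypothetical helper, the frame is a toy list of rows rather than an NV12 buffer, and nearest-neighbour sampling is an assumption, since the interpolation actually used by the plugin is not stated here.

```python
def crop_and_resize(frame, bbox, out_w, out_h):
    """Crop bbox=(left, top, width, height) from frame (a list of pixel rows)
    and resize the crop to out_w x out_h.

    Nearest-neighbour sampling is an assumption made for this sketch only.
    """
    left, top, width, height = bbox
    out = []
    for oy in range(out_h):
        # Map the output row back to a source row inside the bbox.
        sy = top + min(height - 1, oy * height // out_h)
        row = []
        for ox in range(out_w):
            # Map the output column back to a source column inside the bbox.
            sx = left + min(width - 1, ox * width // out_w)
            row.append(frame[sy][sx])
        out.append(row)
    return out


# Toy 4x6 "frame" whose pixel value encodes its own position as (row, col).
frame = [[(r, c) for c in range(6)] for r in range(4)]

# Crop a 2x2 object at left=2, top=1 and stretch it to a 4x4 network input.
crop = crop_and_resize(frame, bbox=(2, 1, 2, 2), out_w=4, out_h=4)
print(crop[0][0], crop[3][3])
```

Note that this sketch stretches the crop to the target size without preserving the aspect ratio.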

You can refer to "2. [DS5.0GA_Jetson_dGPU_Plugin] Dump the Inference Input" in DeepStream SDK FAQ - #9 by mchi to dump the input just before the TensorRT enqueue() function in the SGIE.

Thank you for the response. Sorry for the confusion. I omitted the nvstreammux at the front, assuming it was a given since most pipelines link the PGIE to nvstreammux.

the input of sgie is crops from the buffer cached in nvstreammux with the BBOX generated by pgie.

I use the following settings to transform image tensors with values in [0, 255] to [-1, 1]. Does this mean that the SGIE receives the frames as preprocessed for the PGIE (range [-1, 1]), or the original input frames (range [0, 255])?

net-scale-factor=0.007843137255
offsets=127.5;127.5;127.5
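
As a quick sanity check of these two values, the sketch below applies y = net-scale-factor * (x - offset) to the darkest, middle, and brightest pixel values. `normalize_pixel` is a hypothetical helper written for this post, not part of DeepStream.

```python
# Values copied from the config lines above.
NET_SCALE_FACTOR = 0.007843137255  # roughly 1 / 127.5
OFFSETS = (127.5, 127.5, 127.5)    # one offset per channel


def normalize_pixel(pixel, scale=NET_SCALE_FACTOR, offsets=OFFSETS):
    """Apply y = scale * (x - offset) to each channel of one pixel."""
    return tuple(scale * (x - m) for x, m in zip(pixel, offsets))


darkest = normalize_pixel((0, 0, 0))
middle = normalize_pixel((127.5, 127.5, 127.5))
brightest = normalize_pixel((255, 255, 255))

# Expect values very close to -1, 0 and +1 respectively.
print(darkest, middle, brightest)
```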

you can refer to " 2. [DS5.0GA_Jetson_dGPU_Plugin] Dump the Inference Input " in DeepStream SDK FAQ - #9 by mchi to dump the input just before tensorrt enqueue() function in sgie.

Thank you for this piece of information. I will take a look at the thread in detail and report back.

No. The settings of one GIE do not affect the input to another GIE.
A GIE always crops the YUV (e.g. NV12) data cached in nvstreammux.
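
A small sketch of what this independence means in practice: each GIE reads its own net-scale-factor and offsets and applies them to the same raw pixel values. The PGIE values are the ones quoted earlier in the thread. The SGIE values, the `load_preprocess` and `apply_preprocess` helpers, and the use of `configparser` to read a `[property]` group are assumptions made for illustration only.

```python
import configparser

# PGIE values as posted in this thread.
PGIE_CFG = """
[property]
net-scale-factor=0.007843137255
offsets=127.5;127.5;127.5
"""

# Hypothetical SGIE values (maps [0, 255] to roughly [0, 1]).
SGIE_CFG = """
[property]
net-scale-factor=0.003921568627
offsets=0;0;0
"""


def load_preprocess(cfg_text):
    """Read net-scale-factor and offsets from a [property] group."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    prop = parser["property"]
    scale = float(prop["net-scale-factor"])
    offsets = [float(v) for v in prop["offsets"].split(";")]
    return scale, offsets


def apply_preprocess(pixel, scale, offsets):
    """y = scale * (x - offset), per channel."""
    return [scale * (x - m) for x, m in zip(pixel, offsets)]


# The same raw pixel is normalized separately for each GIE.
raw_pixel = [0, 128, 255]
pgie_in = apply_preprocess(raw_pixel, *load_preprocess(PGIE_CFG))
sgie_in = apply_preprocess(raw_pixel, *load_preprocess(SGIE_CFG))
print(pgie_in)
print(sgie_in)
```

If the SGIE model was trained with a different normalization than the PGIE, its own config needs the matching values. It does not inherit them from the PGIE.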

Thank you very much for the clarification!

Regarding the link that you sent me, it was very helpful. I have a question:

I noticed that the zero-padding (black) is added when maintain-aspect-ratio=1 is set. I also noticed that the zero-padding is added to the bottom of the image.

Is there a way to ensure that the zero-padding is evenly distributed on the left, top, bottom and right of the image instead of being concentrated in the bottom?

This will be supported in DS 6.0 GA, which will be released in the following 1~2 weeks.
After padding, the image will be in the center of the frame.
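
The difference between the two layouts is just where the leftover pixels go. The sketch below computes the padding for an aspect-preserving resize in both cases: all padding after the image (as observed above with maintain-aspect-ratio=1) versus split evenly around it. `letterbox_padding` is a hypothetical helper, and the integer rounding is an assumption that may differ from what the plugin does.

```python
def letterbox_padding(src_w, src_h, net_w, net_h, centered):
    """Return (scaled_w, scaled_h, (left, top, right, bottom)) for an
    aspect-preserving resize of a src_w x src_h crop into net_w x net_h."""
    # Pick the limiting dimension using integer arithmetic.
    if src_w * net_h >= src_h * net_w:
        scaled_w = net_w
        scaled_h = src_h * net_w // src_w
    else:
        scaled_h = net_h
        scaled_w = src_w * net_h // src_h

    pad_x = net_w - scaled_w
    pad_y = net_h - scaled_h

    if centered:
        # Split the padding evenly around the image.
        left, top = pad_x // 2, pad_y // 2
    else:
        # Image stays at the top-left corner; padding goes right/bottom.
        left, top = 0, 0
    return scaled_w, scaled_h, (left, top, pad_x - left, pad_y - top)


# A 100x50 object crop going into a 224x224 network input.
corner = letterbox_padding(100, 50, 224, 224, centered=False)
center = letterbox_padding(100, 50, 224, 224, centered=True)
print(corner)
print(center)
```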

Thank you for the valuable information!
