Some questions about DeepStream 5

• Hardware Platform (Jetson / GPU) : jetson nano
• DeepStream Version : 2.0 DP
• JetPack Version (valid for Jetson only) : 4.4 DP
• TensorRT Version : 7.1

1- How heavily does DeepStream load the Jetson Nano when launching DeepStream FaceDetection-IR?
2- Is it possible to deploy FaceDetection-IR with the DeepStream SDK on multiple RTSP streams simultaneously? Is it easy to do?
3- On https://ngc.nvidia.com/catalog/models/nvidia:tlt_facedetectir, how do I run face detection in DeepStream? The example there shows PeopleNet.

1- How heavily does DeepStream load the Jetson Nano when launching DeepStream FaceDetection-IR?
Normally, on the Nano, inference will be the bottleneck. You can refer to https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps#measure-the-inference-perf to measure the inference performance and check how many streams it takes to fully utilize the GPU.

2- Is it possible to deploy FaceDetection-IR with the DeepStream SDK on multiple RTSP streams simultaneously? Is it easy to do?

Yes, it’s easy. You can refer to the objectDetector_Yolo sample.

3- On https://ngc.nvidia.com/catalog/models/nvidia:tlt_facedetectir, how do I run face detection in DeepStream? The example there shows PeopleNet.

You can refer to https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps

I saw a demo of the Jetson Nano that processes 8 1080p30 streams at the same time.
1- If the demo processed 8 1080p streams at the same time, we would need 8 models loaded and 8 frames passed to the models and processed at the same time. How can 8 models be loaded together in 4 GB of RAM? Is that possible? I suspect they used one model and processed the frames of the 8 streams one by one. We also need some memory for resizing the 1080p frames to the model’s input size, and the CPU and GPU use shared memory, right?

If the demo processed 8 1080p streams at the same time, we would need 8 models loaded and 8 frames passed to the models and processed at the same time.

No, it does not work like this.
As the diagram below shows, the 8 streams send their frames to nvstreammux, which batches these frames into one batch and then sends it to nvinfer for inference (each inference loop processes 8 frames).

stream#1 -> |
… | -> nvstreammux (batch the frames, e.g. one batch=8 frames) --> nvinfer (one model) -->
stream#8 -> |
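
The layout in the diagram can be sketched as a small Python helper that only builds a gst-launch-1.0 style pipeline description. This is a minimal sketch, not the demo's actual pipeline: the helper name build_batched_pipeline, the example URIs, the uridecodebin sources and the fakesink at the end are assumptions made for illustration. nvstreammux (batch-size, width, height, sink_N pads) and nvinfer (config-file-path, batch-size) are the DeepStream elements discussed above.

```python
def build_batched_pipeline(uris, infer_config, width=1920, height=1080):
    """Return a gst-launch-1.0 style description: N sources -> one nvstreammux -> one nvinfer."""
    batch = len(uris)
    if batch == 0:
        raise ValueError("at least one stream URI is required")
    # One muxer and one nvinfer instance (one model) are shared by every stream.
    parts = [
        f"nvstreammux name=mux batch-size={batch} width={width} height={height}",
        f"! nvinfer config-file-path={infer_config} batch-size={batch}",
        "! fakesink",
    ]
    # Each source is decoded separately and linked to its own muxer sink pad.
    for i, uri in enumerate(uris):
        parts.append(f"uridecodebin uri={uri} ! mux.sink_{i}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical RTSP addresses, for illustration only.
    streams = [f"rtsp://camera-{i}.example/stream" for i in range(8)]
    print(build_batched_pipeline(streams, "config_infer_primary_nano.txt"))
```

The point of the sketch is that the number of nvinfer instances stays at one regardless of how many sources are attached; only the batch size grows with the stream count.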

Thanks.
1- How many models are used in that demo for 8 streams? One model for all?
2- The model processes 1 frame at a time, right? It doesn’t use an input size of (N>1,H,W,C) and process the frames at the same time?
3- In my opinion, if the model processes single frames at 80 FPS, then for 8 streams we get 10 FPS per stream, right? If so, how can it handle 8 streams at 30 FPS? Is this not real time?
3-

8 streams send their frames to nvstreammux,

Is multi-threading used for this task?

4- Is the frame resizing done on the CPU or the GPU?

1- How many models are used in that demo for 8 streams? One model for all?

I don’t know which demo you are referring to, so I don’t know how many models there are,
but I believe it’s 8 models to process 8 streams.

2- The model processes 1 frame at a time, right? It doesn’t use an input size of (N>1,H,W,C) and process the frames at the same time?

As I mentioned above, each inference loop processes 8 frames (one batch).

It doesn’t use an input size of (N>1,H,W,C) and process the frames at the same time?

Yes, it’s (N>1,H,W,C). N is the batch size, e.g. 8 for 8 streams.

Is multi-threading used for this task?

No, one inference pass processes 8 frames together.
I would recommend checking the TensorRT docs below, or searching for “TensorRT batch”.

https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#batching
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html

4- Is the frame resizing done on the CPU or the GPU?

On dGPU platforms, it’s on the GPU.
On Jetson, it’s on the VIC (Video Image Compositor) + GPU. The VIC is a hardware accelerator in the Jetson SoC.

Thanks.
This demo.
Does the VIC have a plugin? How can I use it in Python code? Is this related to GStreamer?

It should be one model - DetectNetv2

Does the VIC have a plugin?

It’s in the nvinfer plugin.

In this demo one stream is for people detection, and it uses PeopleNet (DetectNet2 + ResNet). This network runs at 10 FPS on the Jetson Nano, so I don’t know how it processes 1080p at 30 FPS.

So the Jetson Nano has separate hardware for scaling images, different from the decoder hardware?

So the Jetson Nano has separate hardware for scaling images, different from the decoder hardware?

Yes, as mentioned above, the GPU and the VIC can be used for scaling.
The decoder does not do scaling.

Because I connected a USB TPU to the Jetson Nano, I need to capture the decoded and resized streams and pass them to the TPU. How can I capture the stream frames from the decoder/VIC in array format?

I used the code below for decoding and rescaling. The decoder uses hardware: when I run sudo tegrastats, it shows NVDEC. How can I check the same for the rescaling hardware?

import cv2

# GStreamer pipeline: RTSP source -> hardware H.264 decode -> scale/convert -> appsink
gstream_elements = (
                'rtspsrc location=rtsp latency=300 ! '
                'rtph264depay ! h264parse ! '
                'queue max-size-buffers=100 leaky=2 ! '
                'omxh264dec enable-max-performance=1 enable-low-outbuffer=1 ! '
                'video/x-raw(memory:NVMM), format=(string)NV12 ! '
                'nvvidconv ! video/x-raw, width=450, height=450, format=(string)BGRx ! '
                'videorate ! video/x-raw, framerate=(fraction)10/1 ! '
                'videoconvert ! '
                'appsink')
cap = cv2.VideoCapture(gstream_elements, cv2.CAP_GSTREAMER)

In my opinion, the element below from the pipeline above uses the VIC hardware, right?

'nvvidconv ! video/x-raw , width=450, height=450, format=(string)BGRx !'

Yes, but this seems to be an old setup, since omxh264dec is deprecated and nvvidconv is replaced with nvvideoconvert for DeepStream.

And, for DeepStream, the resizing and conversion for inference are done in nvinfer.
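
For the question about getting frames in array format outside DeepStream, here is a minimal sketch of an updated OpenCV capture. It assumes nvv4l2decoder as the replacement for the deprecated omxh264dec (the reply above does not name one) and keeps nvvidconv for scaling, since this pipeline does not go through DeepStream. The helper names build_capture_pipeline and read_frames, the BGR caps before appsink, and the appsink drop/max-buffers settings are illustrative choices, not something stated in this thread.

```python
def build_capture_pipeline(rtsp_url, width=450, height=450, fps=10):
    """Return an OpenCV-compatible GStreamer description using nvv4l2decoder."""
    return (
        f"rtspsrc location={rtsp_url} latency=300 ! "
        "rtph264depay ! h264parse ! "
        "nvv4l2decoder ! "  # assumed replacement for omxh264dec
        f"nvvidconv ! video/x-raw, width={width}, height={height}, format=BGRx ! "
        f"videorate ! video/x-raw, framerate={fps}/1 ! "
        "videoconvert ! video/x-raw, format=BGR ! "
        "appsink drop=true max-buffers=1"
    )


def read_frames(rtsp_url, limit=10):
    """Yield up to `limit` decoded frames as NumPy arrays of shape (height, width, 3)."""
    import cv2  # imported lazily; needs an OpenCV build with GStreamer support

    cap = cv2.VideoCapture(build_capture_pipeline(rtsp_url), cv2.CAP_GSTREAMER)
    try:
        for _ in range(limit):
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()
```

Each frame returned by cap.read() is already a NumPy array, so it can be handed to another accelerator's runtime without a separate conversion step.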

Thanks,
1- Is nvvideoconvert faster than nvvidconv?
2- Does the resizer in nvinfer also use nvvideoconvert in DeepStream?

nvvideoconvert supports more functions, you can find
nvvideoconvert doc in - https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream%20Plugins%20Development%20Guide/deepstream_plugin_details.3.07.html#
nvvidconv doc in - https://developer.download.nvidia.com/embedded/L4T/r32_Release_v1.0/Docs/Accelerated_GStreamer_User_Guide.pdf?dS7IUl_-XGtI_RgFLtpxLpu0gHZUyDfG-HubiJqK2mfhL-GIm6IAErFQZFx0xEJU7fj8y_h6fodYwhjaRwHEX1Z315Kp7SCodOsyviFbsWeWFIadxOmxSdKk9Q9b319wEiZzHNp1RP7Gm9kwPMdo0GMyeiYsyrn9PlIp0D8qH-ayIMqdgw4

2- Does the resizer in nvinfer also use nvvideoconvert in DeepStream?

No.

In the DeepStream config sample there is an interval option. What is it?
I have 8 1080p streams and use detectnet_v2_resnet10 for detection. If I set interval=0, processing becomes very slow; when I set interval=4, processing is fine and fast. Does that mean every 4th frame is used for processing?

Part of source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_nano.txt:

[primary-gie]
enable=1
gpu-id=0
model-engine-file=../../models/Primary_Detector_Nano/resnet10.caffemodel_b8_gpu0_fp16.engine
batch-size=8
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=4
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_nano.txt
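
As a side note, the batch-size and interval values in a group like this can be read programmatically. The following is a minimal sketch using Python's standard configparser on a shortened copy of the group above; the helper name read_gie_settings is hypothetical, and treating a missing interval key as 0 is an assumption about the default.

```python
import configparser

SAMPLE = """\
[primary-gie]
enable=1
batch-size=8
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
interval=4
gie-unique-id=1
config-file=config_infer_primary_nano.txt
"""


def read_gie_settings(text, section="primary-gie"):
    """Return (batch_size, interval) from a DeepStream-style config string."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    gie = parser[section]
    # Assumption: interval is treated as 0 (infer on every batch) when absent.
    return gie.getint("batch-size"), gie.getint("interval", fallback=0)


if __name__ == "__main__":
    print(read_gie_settings(SAMPLE))
```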

Please refer to the “interval” explanation in https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream%20Plugins%20Development%20Guide/deepstream_plugin_details.3.01.html#wwpID0E0MDB0HA

Thanks,
The doc says that interval “Specifies the number of consecutive batches to be skipped for inference”.
If we have 8 RTSP streams and set batch-size=8 and interval=4, then we capture every 4th frame of each stream, right? If so, what’s the difference between interval and the drop-frame-rate GStreamer option?

Does this mean skipping 4 frames per stream, then capturing one frame, skipping 4 frames per stream, then capturing one frame, …?

the drop-frame-rate GStreamer option

Where is drop-frame-rate from?

interval = 4: does this mean skipping 4 frames per stream, then capturing one frame, skipping 4 frames per stream, then capturing one frame, …?

drop-frame-interval :

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file://../../streams/sample_1080p_h264.mp4
num-sources=8
#drop-frame-interval=2
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

drop-frame-interval drops frames at the source, so it affects all following components, e.g. display and encoding. For example, with drop-frame-interval=5 on a 30 fps stream, the decoder outputs only every fifth frame, so the display receives only 6 fps.
interval only affects the inference components.
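
The difference between the two settings can be put into numbers with a minimal sketch. It assumes the semantics given in the DeepStream documentation: drop-frame-interval=N makes the decoder output every Nth frame (0 disables dropping), and interval=K makes nvinfer skip K consecutive batches and then infer on one. The function name stream_rates is hypothetical.

```python
def stream_rates(source_fps, drop_frame_interval=0, infer_interval=0):
    """Return (displayed_fps, inferred_fps) for a single stream."""
    # drop-frame-interval=N: decoder outputs every Nth frame; 0 disables dropping.
    step = drop_frame_interval if drop_frame_interval > 0 else 1
    displayed = source_fps / step
    # interval=K: nvinfer skips K consecutive batches, then infers on one.
    inferred = displayed / (infer_interval + 1)
    return displayed, inferred


if __name__ == "__main__":
    # interval=4 alone: every frame is still displayed, one in five is inferred.
    print(stream_rates(30, infer_interval=4))
    # drop-frame-interval=5 alone: display and inference both see fewer frames.
    print(stream_rates(30, drop_frame_interval=5))
```

Under these assumptions, interval=4 on 8 streams of 30 fps means the model handles 8 x 6 = 48 frames per second instead of 240, while the display still receives every frame.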