When latency measurement is enabled, the results seem to have a problem

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.1
• JetPack Version (valid for Jetson only):
• TensorRT Version: 8.2.5-1+cuda11.4
• NVIDIA GPU Driver Version (valid for GPU only): 510.68.02
• Issue Type (questions, new requirements, bugs): questions
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file content, the command line used and other details for reproducing): export NVDS_ENABLE_LATENCY_MEASUREMENT=1
• Requirement details (This is for new requirements. Include the module name - for which plugin or for which sample application - and the function description):

Hi, I recently ran into an issue when running the deepstream-app program in the nvcr.io/nvidia/deepstream:6.1-devel container.
When I export NVDS_ENABLE_LATENCY_MEASUREMENT=1,
the results seem strange. Here is the screenshot:
[screenshot of the latency output]

batch-num=148 and batch-num=149 process the same data. For the test I chose 1 IPC stream and 3 files with file-loop enabled, set the streammux batch-size to 4, and set the YOLOv5 batch-size to 4. Why do I get these results?
Here is my config file:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP 5=CSI
type=4
#uri=file:/opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h264.mp4
uri=rtsp://admin:zhy12345@172.16.72.64:554/Streaming/Channels/101
#uri=rtmp://192.168.0.14:1935/live/live999
num-sources=1
gpu-id=0

#(0): memtype_device - Memory type Device
#(1): memtype_pinned - Memory type Host Pinned
#(2): memtype_unified - Memory type Unified
cudadec-memtype=0

[source1]
enable=1
type=3
uri=file:/opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_qHD.mp4
#uri=rtmp://192.168.0.14:1935/live/live999
#uri=rtsp://172.16.80.161:554/live/main_stream
num-sources=1
gpu-id=0
cudadec-memtype=0

[source2]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP 5=CSI
type=3
uri=file:/opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_ride_bike.mov
#uri=rtmp://192.168.0.14:1935/live/live999
#uri=rtsp://172.16.80.161:554/live/main_stream
num-sources=1
gpu-id=0

#(0): memtype_device - Memory type Device
#(1): memtype_pinned - Memory type Host Pinned
#(2): memtype_unified - Memory type Unified
cudadec-memtype=0

[source3]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP 5=CSI
type=3
uri=file:/opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h265.mp4
#uri=rtmp://192.168.0.14:1935/live/live999
num-sources=1
gpu-id=0

#(0): memtype_device - Memory type Device
#(1): memtype_pinned - Memory type Host Pinned
#(2): memtype_unified - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=1
source-id=0

#Indicates how fast the stream is to be rendered. 0: As fast as possible 1: Synchronously
sync=0
gpu-id=0
nvbuf-memory-type=0
codec=1
enc-type=0
qos=0
bitrate=4000000
iframeinterval=30
rtsp-port=8857
udp-port=5400

[sink1]
enable=1
type=1
source-id=1
sync=0
gpu-id=0
nvbuf-memory-type=0
codec=1
enc-type=0
qos=0
bitrate=4000000
iframeinterval=30
rtsp-port=8858
udp-port=5401
[sink2]
enable=1
type=1
source-id=2
sync=0
gpu-id=0
nvbuf-memory-type=0
codec=1
enc-type=0
qos=0
bitrate=4000000
iframeinterval=30
rtsp-port=8859
udp-port=5402

[sink3]
enable=1
type=1
source-id=3
sync=0
gpu-id=0
nvbuf-memory-type=0
codec=1
enc-type=1
qos=0
bitrate=4000000
iframeinterval=30
rtsp-port=8860
udp-port=5403

[osd]
enable=1
gpu-id=0
border-width=5
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=1

## Set according to the number of streams
batch-size=4
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000

## Set muxer output width and height
width=1920
height=1080
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
batch-size=4
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV5.txt

[tests]
file-loop=1

There is another strange result. When I choose one RTSP stream and set streammux batch-size=4, batched-push-timeout=40000, and the YOLOv5 infer-engine batch-size=4, then within one batch one frame's latency is less than 40 ms while another is longer than 40 ms. Does this mean the GPU inference is serial even though I set the infer-engine batch-size=4? How can I infer the data in parallel?

I will try.

Please set batch-size=1 if there is only one source.

The GPU will wait for a batch of data, then process it in parallel.
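
For example, with a single source, the batching-related keys could look like the sketch below (a minimal illustration, not a complete config; the timeout value is simply carried over from your file):

[streammux]
## match the number of active sources (1 here)
batch-size=1
batched-push-timeout=40000

[primary-gie]
## keep the inference batch size consistent with streammux
batch-size=1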

uh…

  1. What can I do if I want to batch-process data but only have one stream? If I set the infer-engine batch-size=4, does that mean the GPU will batch data automatically even with streammux batch-size=1?

Using a similar configuration file, I can't reproduce this issue. Could you share your whole terminal logs and config_infer_primary_yoloV5.txt? Here is my test report:
log.txt (6.7 MB)
cfg.txt (6.6 KB)

I found the difference between the cfg files:
I enable several sinks, while you only enable one sink. When I enable only one sink, the result seems correct.


source0 is an RTSP stream.
The others are file streams. I set file-loop=1. The values seem strange.

Here is the reason:
tiled-display defaults to 0 because it is not set, so the pipeline includes nvstreamdemux and four fakesinks, and there are four latency_measurement_buf_prob probe functions, one on each of the four sinks (you can debug this in create_pipeline()). After every frame is inferred, the usermeta is copied into four copies by nvstreamdemux, and latency_measurement_buf_prob is entered four times, so the same values are printed multiple times because the usermeta is the same. batch_num is a global variable, and it changes on every call.
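
For reference, below is a simplified sketch of what such a sink-pad probe looks like (this is not the exact deepstream-app source; MAX_SOURCES and the stack-allocated latency_info array are illustrative assumptions):

/* Simplified latency-measurement probe, attached to each sink's sink pad
 * with gst_pad_add_probe() and GST_PAD_PROBE_TYPE_BUFFER. */
#include <gst/gst.h>
#include "nvds_latency_meta.h"  /* nvds_measure_buffer_latency(), NvDsFrameLatencyInfo */

#define MAX_SOURCES 4           /* illustrative: must be >= streammux batch-size */

static guint batch_num = 0;     /* global counter, incremented on EVERY probe call;
                                   with four sinks the same batch is printed under
                                   four different batch numbers */

static GstPadProbeReturn
latency_measurement_buf_prob (GstPad *pad, GstPadProbeInfo *info, gpointer u_data)
{
  if (nvds_enable_latency_measurement) {
    GstBuffer *buf = (GstBuffer *) info->data;
    NvDsFrameLatencyInfo latency_info[MAX_SOURCES];
    guint num_in_batch = nvds_measure_buffer_latency (buf, latency_info);
    guint i;

    g_print ("\n************BATCH-NUM = %u**************\n", batch_num);
    for (i = 0; i < num_in_batch; i++) {
      g_print ("Source id = %u Frame_num = %u Frame latency = %lf (ms)\n",
          latency_info[i].source_id, latency_info[i].frame_num,
          latency_info[i].latency);
    }
    batch_num++;  /* the same batch arriving on another sink gets a new number */
  }
  return GST_PAD_PROBE_OK;
}

Since nvstreamdemux copies the batch meta to all four sinks, this probe fires four times per inferred batch, which produces the repeated latency lines and the jumping batch numbers you saw.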

Thanks, I will work out the details following your guidance.

Is this still an issue that needs support? Thanks

yeah, thanks for your support.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.