Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) RTX 2080
• DeepStream Version 5.0-dp
• TensorRT Version 18.104.22.168
• NVIDIA GPU Driver Version (valid for GPU only) 440.64.00
So far in my tests I haven't seen any consistent performance gain or loss from using classifier_async_mode=1 for secondary GIEs.
I ran the deepstream-app example with the source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt config, modifying only the [source0] and [sink0] groups as follows:
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=2
uri=file://../../streams/sample_1080p_h264.mp4
num-sources=1
#drop-frame-interval=2
gpu-id=0
# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=1
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
Between runs I toggle classifier_async_mode directly in the secondary-GIE config files, e.g. config_infer_secondary_vehicletypes.txt.
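Concretely, the toggle looks like this in the [property] group (the process-mode line is shown only for context and reflects my understanding of the sample secondary configs; treat it as illustrative):

```ini
[property]
# ... other secondary-classifier properties unchanged ...
process-mode=2            # 2 = secondary mode, operate on detected objects
classifier-async-mode=1   # switched between 0 (sync) and 1 (async) per run
```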
Performance measurements are as reported by the sample app in the terminal. Throughput stays very similar, occasionally rising from about 200 FPS to 210 FPS. Frame latency is essentially unchanged, while per-component latency decreases for the secondary GIEs and increases for
So my question is: what is the benefit of classifier_async_mode? Can I expect performance gains from it, especially in throughput, and in what scenarios? (I understand it needs track IDs, but perhaps some other pipeline elements are also needed.)
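For context on where I would expect a win: my mental model (an assumption, not something I've confirmed in the docs or source) is that in async mode the frame is not held up waiting for the classifier, and the result is attached to a later frame of the same track ID once inference completes. A toy sketch of that idea:

```python
# Toy model of async secondary classification. This is my reading of
# classifier_async_mode, NOT DeepStream's actual implementation: frames
# are never blocked on the classifier; labels appear on later frames of
# the same track id once the (simulated) inference finishes.
from collections import deque

def run_async(frames, classify, delay=1):
    """frames: per-frame lists of track ids; classify: per-object model.
    delay: number of frames the simulated inference takes to complete."""
    cache = {}        # track id -> label, once available
    pending = deque() # (frame index when result is ready, track id)
    out = []
    for i, objects in enumerate(frames):
        # Results that finished by this frame become visible downstream.
        while pending and pending[0][0] <= i:
            _, tid = pending.popleft()
            cache[tid] = classify(tid)
        # Submit new objects without blocking; label is missing this frame.
        for tid in objects:
            if tid not in cache and all(t != tid for _, t in pending):
                pending.append((i + delay, tid))
        out.append({tid: cache.get(tid) for tid in objects})
    return out

# One tracked object over three frames: the label shows up one frame late,
# but no frame ever waited on the classifier.
frames = [[1], [1], [1]]
print(run_async(frames, lambda t: "sedan"))
# [{1: None}, {1: 'sedan'}, {1: 'sedan'}]
```

If that model is right, the gain would show up mainly when secondary inference is a pipeline bottleneck, which might explain why I see little change here.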