Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GPU (GTX1070)
• DeepStream Version: 5.0.1-20.09-triton
• TensorRT Version: 7.0.0
• NVIDIA GPU Driver Version (valid for GPU only): 460.32.03
• Issue Type (questions, new requirements, bugs): questions
During inference with deepstream-app I only reach 60 FPS with my TensorRT-optimized model (MobileNetV2, 300x300), and GPU utilization is only 30%.
In my config file, under “[sink0]”, I changed “sync=1” to “sync=0” to use the full computing power. The FPS jumps from 30 to 60, but nvidia-smi shows that GPU utilization is still only 30%.
When I change “type” under “[sink0]” from 2 (EglSink) to 1 (FakeSink), I get nearly 300 FPS for the model and a GPU utilization of 95-100%.
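For anyone who wants to reproduce the comparison, here is a minimal sketch that generates the two config variants from one base file. It assumes the config parses as plain INI, which may not hold for every deepstream-app config. The `BASE_CONFIG` excerpt and the helper name `with_sink_settings` are made up for illustration, and `configparser` drops comments when it rewrites the file.

```python
import configparser
import io

# Hypothetical excerpt; a real deepstream-app config has many more groups and keys.
BASE_CONFIG = """\
[sink0]
enable=1
type=2
sync=1
"""

def with_sink_settings(config_text, sink_type, sync):
    """Return a copy of the config text with [sink0] type and sync replaced."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(config_text)
    parser["sink0"]["type"] = str(sink_type)
    parser["sink0"]["sync"] = str(sync)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

# type=2 is EglSink and type=1 is FakeSink, as described in the post.
egl_unsynced = with_sink_settings(BASE_CONFIG, sink_type=2, sync=0)
fake_unsynced = with_sink_settings(BASE_CONFIG, sink_type=1, sync=0)
print(fake_unsynced)
```

Writing each variant to its own file keeps the two runs otherwise identical, so any FPS difference can be attributed to the sink alone.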
Thank you for the fast response @bcao.
No, I don’t. I thought latency measurement with the Latency Measurement API was only possible for live sources and RTSP streaming, not for file sinks (option 3 for type in [sink0]).
See the following link: Latency measurement issue - #3 by Amycao.
Or is it now possible to use NVDS_ENABLE_LATENCY_MEASUREMENT and NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT for file sinks as well?
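A minimal sketch of how both variables could be set for a single run, assuming deepstream-app is on the PATH. The helper names `latency_env` and `run_deepstream` and the config path in the usage note are hypothetical. Only the two variable names and the `-c` option come from DeepStream itself.

```python
import os
import subprocess

def latency_env(base_env=None, per_component=True):
    """Copy an environment and switch on DeepStream latency measurement."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVDS_ENABLE_LATENCY_MEASUREMENT"] = "1"
    if per_component:
        # Additionally request the per-component breakdown.
        env["NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT"] = "1"
    return env

def run_deepstream(config_path):
    """Start deepstream-app with the given config (assumes it is on PATH)."""
    return subprocess.run(
        ["deepstream-app", "-c", config_path],
        env=latency_env(),
        check=False,
    )
```

Calling `run_deepstream("my_config.txt")` (a placeholder path) would start one run with the measurement enabled, without changing the environment of the calling shell.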
Results for NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1 and EglSink
BATCH-NUM = 4384**
Comp name = nvv4l2decoder0
component latency= 49.979004
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 4384 component_latency = 0.192871
Comp name = primary_gie
component latency= 55.223145
Comp name = tiled_display_tiler
component latency= 0.349121
Comp name = osd_conv
component latency= 1.545166
Comp name = nvosd0
component latency= 15.479004
Source id = 0 Frame_num = 4384 Frame latency = 134.345947 (ms)
Results for NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1 and FakeSink
BATCH-NUM = 4384**
Comp name = nvv4l2decoder0
component latency= 8.173096
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 4384 component_latency = 2.907959
Comp name = primary_gie
component latency= 10.752197
Comp name = tiled_display_tiler
component latency= 0.229980
Comp name = osd_conv
component latency= 0.218018
Comp name = nvosd0
component latency= 1.289795
Source id = 0 Frame_num = 4384 Frame latency = 23.814941 (ms)
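To compare the two runs side by side, here is a small sketch that parses the component latencies exactly as pasted above and ranks the components by how much slower they are with EglSink. The regular expression is an assumption fitted to this paste, not to the full deepstream-app log format. Per-frame latency is also not the same as throughput in a pipelined app, so this shows where the time goes rather than the FPS directly.

```python
import re

EGL_LOG = """\
Comp name = nvv4l2decoder0
component latency= 49.979004
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 4384 component_latency = 0.192871
Comp name = primary_gie
component latency= 55.223145
Comp name = tiled_display_tiler
component latency= 0.349121
Comp name = osd_conv
component latency= 1.545166
Comp name = nvosd0
component latency= 15.479004
"""

FAKE_LOG = """\
Comp name = nvv4l2decoder0
component latency= 8.173096
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 4384 component_latency = 2.907959
Comp name = primary_gie
component latency= 10.752197
Comp name = tiled_display_tiler
component latency= 0.229980
Comp name = osd_conv
component latency= 0.218018
Comp name = nvosd0
component latency= 1.289795
"""

# Matches both "component latency= X" and "component_latency = X".
PATTERN = re.compile(
    r"Comp name = (\S+).*?component[ _]latency\s*=\s*([0-9.]+)", re.S
)

def parse_latencies(log_text):
    """Map component name to latency in milliseconds."""
    return {name: float(ms) for name, ms in PATTERN.findall(log_text)}

def rank_by_increase(slow, fast):
    """Components sorted by latency increase (slow minus fast), largest first."""
    deltas = {name: slow[name] - fast[name] for name in slow if name in fast}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)

egl = parse_latencies(EGL_LOG)
fake = parse_latencies(FAKE_LOG)
ranking = rank_by_increase(egl, fake)

for name, delta in ranking:
    print(f"{name:22s} {fake[name]:8.3f} -> {egl[name]:8.3f} ms ({delta:+.3f})")
```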
It seems the FPS difference is due to the nvv4l2decoder, primary_gie and nvosd0 components. To be honest, I don’t know what to do next to reduce these numbers.