I have successfully run multi-stream decoding with batched inference using DeepStream, and I now want to measure the end-to-end performance: decoding latency and throughput, preprocessing latency and throughput, inference latency, and so on. Are there more parameters and tools available for this, such as NVDS_ENABLE_LATENCY_MEASUREMENT, NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT, and perf-measurement-interval-sec?
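For context, this is how I currently understand these knobs are enabled, based on the deepstream-app documentation; please correct me if I have it wrong. The two environment variables are exported before launching the app (the config file name below is just a placeholder for my own config), and a probe sketch at the end of this post shows how I would read the per-frame latency metadata myself:

```shell
# Print per-frame latency for each source in the batch.
export NVDS_ENABLE_LATENCY_MEASUREMENT=1
# Additionally print per-plugin (component) latency.
export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1

deepstream-app -c my_multi_source_config.txt
```

As I understand it, perf-measurement-interval-sec is a key in the [application] group of the deepstream-app config file and only controls how often the FPS summary is printed, not the latency output:

```
[application]
enable-perf-measurement=1
# Print throughput (FPS per source) every 5 seconds.
perf-measurement-interval-sec=5
```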
deepstream-app version 6.0.1
DeepStreamSDK 6.0.1
CUDA Driver Version: 11.4
CUDA Runtime Version: 11.4
TensorRT Version: 8.0
cuDNN Version: 8.2
libNVWarp360 Version: 2.0.1d3
OS: Ubuntu 18.04
Driver Version: 470.63.01
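If it helps clarify what I am after, below is a minimal sketch of the kind of buffer probe I would attach myself to read the latency metadata, based on nvds_latency_meta.h from the SDK. MAX_SOURCES and the probe placement (e.g. on the sink pad of the tiler or OSD element) are my own assumptions, not taken from an official sample:

```c
/* Minimal sketch: a GStreamer buffer probe that reads DeepStream
 * frame-latency metadata. Assumes NVDS_ENABLE_LATENCY_MEASUREMENT=1
 * was exported before the process started. */
#include <gst/gst.h>
#include "nvds_latency_meta.h"

#define MAX_SOURCES 16  /* assumed upper bound on sources per batch */

static GstPadProbeReturn
latency_buf_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = (GstBuffer *) info->data;

  if (nvds_get_enable_latency_measurement ()) {
    NvDsFrameLatencyInfo latency_info[MAX_SOURCES];
    /* Fills latency_info[] and returns how many sources were in the batch. */
    guint num_sources = nvds_measure_buffer_latency (buf, latency_info);

    for (guint i = 0; i < num_sources; i++) {
      g_print ("source %u frame %u latency %.2f ms\n",
          latency_info[i].source_id,
          latency_info[i].frame_num,
          latency_info[i].latency);
    }
  }
  return GST_PAD_PROBE_OK;
}

/* The probe would be attached after building the pipeline, e.g.:
 *   gst_pad_add_probe (sink_pad, GST_PAD_PROBE_TYPE_BUFFER,
 *       latency_buf_probe, NULL, NULL);
 */
```

Is this the intended way to get per-component numbers, or is there a better tool for measuring decode/preprocess/inference latency separately?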