- Can you specify which app you are using?
→ I am referencing deepstream_python_apps/apps/deepstream-ssd-parser with custom Python code. I use a custom YOLOv3 ONNX model and access its output tensor data via DeepStream (nvinfer), then post-process it in Python.
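For context, the post-processing in question looks roughly like the following NumPy sketch. This is not the poster's actual code, just a minimal, hypothetical decode of one YOLOv3 output scale, assuming the standard head layout `(num_anchors * (5 + num_classes), grid, grid)`; the anchor values, class count, and input size below are placeholders:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_yolov3_head(raw, anchors, num_classes, img_size):
    """Decode one YOLOv3 output scale.

    raw: array of shape (num_anchors * (5 + num_classes), grid, grid),
         as typically produced by an ONNX-exported YOLOv3 head.
    anchors: list of (w, h) anchor pairs in pixels for this scale.
    Returns boxes (N, 4) as (cx, cy, w, h) in pixels, objectness (N,),
    and per-class scores (N, num_classes).
    """
    na = len(anchors)
    grid = raw.shape[-1]
    stride = img_size / grid
    # (na, 5+nc, gy, gx) -> (na, gy, gx, 5+nc)
    p = raw.reshape(na, 5 + num_classes, grid, grid).transpose(0, 2, 3, 1)

    gy, gx = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    cx = (sigmoid(p[..., 0]) + gx) * stride          # box centre x
    cy = (sigmoid(p[..., 1]) + gy) * stride          # box centre y
    aw = np.array(anchors)[:, 0].reshape(na, 1, 1)
    ah = np.array(anchors)[:, 1].reshape(na, 1, 1)
    w = np.exp(p[..., 2]) * aw                       # box width
    h = np.exp(p[..., 3]) * ah                       # box height
    obj = sigmoid(p[..., 4])                         # objectness
    cls = sigmoid(p[..., 5:])                        # class scores

    boxes = np.stack([cx, cy, w, h], axis=-1).reshape(-1, 4)
    return boxes, obj.reshape(-1), cls.reshape(-1, num_classes)
```

Thresholding and NMS would follow; in the DeepStream pipeline the raw tensor itself would come from the tensor meta attached by nvinfer.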
[amycao] DeepStream runs in asynchronous mode, so not getting the first three frames in the probe is expected. - When measuring performance, boost the clocks to make sure you get stable data:
sudo nvpmodel -m 0   # you can get the model level from /etc/nvpmodel.conf
sudo jetson_clocks
→ I tried that, but the issue is the same. - Did you mean you get a big performance difference for YOLOv3 between DS5 and DS6?
If yes, refer to this post about low YOLO performance on DS6; there is a fix there, see comment 22:
Deepstream 6 YOLO performance issue - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums
→ I mean that the same engine shows a large frame-rate variation within nvinfer itself. Although I do get a big performance difference for YOLOv3 between DS5 and DS6, it was very unstable on DS5, so I cannot currently compare DS5 (always a big gap) to DS6 (sometimes a big gap).
[amycao] It may not be appropriate to treat the run time of m_BackendContext->enqueueBuffer(backendBuffer, *m_InferStream, m_InputConsumedEvent.get()) as the inference time. Since inference runs in a different CUDA stream, the call is asynchronous: when it returns, inference may not have finished yet. I think that is why you see the deviation in measured inference time. You should use trtexec to measure inference time; you can find it under /usr/src/tensorrt/bin/.
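The pitfall above can be illustrated outside CUDA: timing the call that launches work on another stream (or thread) only measures submission overhead, not the work itself. A minimal Python analogy, with a thread standing in for a CUDA stream and a 0.2 s sleep standing in for the inference kernel (both are illustrative stand-ins, not DeepStream code):

```python
import threading
import time

def measure_async_launch():
    """Show that timing an asynchronous launch (like enqueueBuffer on a
    separate CUDA stream) measures submission cost, not the work itself."""
    done = threading.Event()

    def work():
        time.sleep(0.2)   # stand-in for the actual inference kernel
        done.set()

    t0 = time.perf_counter()
    threading.Thread(target=work).start()   # "enqueue": returns immediately
    launch_time = time.perf_counter() - t0  # tiny: the work has not finished

    done.wait()                             # analogous to a stream synchronize
    total_time = time.perf_counter() - t0   # now includes the real work
    return launch_time, total_time
```

Here `launch_time` is only the cost of starting the work, while `total_time` covers the work itself; this is why stopping the clock right after enqueueBuffer yields small and erratic numbers, and why a synchronizing benchmark like trtexec gives stable ones.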