Inference performance measure

twangbarang · January 20, 2021, 2:23pm

• Hardware Platform: GPU
• DeepStream Version: 5.0.0
• TensorRT Version: 7.0.0.11
• NVIDIA GPU Driver Version (valid for GPU only): 460.32.03

Hi, I just started using the DeepStream SDK and have a question:

Question: Using deepstream-app, I set up a config based on the objectDetection_SSD example. How can I measure the time needed for an inference? I'm not sure about the terminology, but what I mean is the time it takes for my model to execute.

bcao · January 21, 2021, 6:13am

You can use trtexec to measure the model’s performance.
If you need to measure DeepStream component latency, you can refer to DeepStream SDK FAQ - #11 by bcao

twangbarang · January 21, 2021, 7:46am

That works great. Thanks. A follow-up question for my understanding: the primary GIE component in my pipeline is in charge of doing the inference, and I would like to make sure that this is actually performed on the GPU. A component latency of 8 ms to 10 ms for the primary GIE is a sound argument for that, I guess. However, I followed the objectDetection_SSD example and customized its custom bounding box parser to suit my model, and I can see in the Makefile that it is compiled with g++, not with a CUDA compiler as I would have expected. So my question is whether this is still executed on the GPU, and whether there is any resource I can read for a better understanding. I expected only CUDA-compiled code to be executed on the GPU, or is this some nvinfer or TensorRT voodoo?

bcao · January 21, 2021, 8:06am

That directory contains only the post-processing parser code. A typical flow in the gst_nvinfer plugin is preprocess -> inference -> postprocess. By the time postprocessing runs, the output has already been copied from device memory to host memory, so the parser is ordinary CPU code. For inference, the plugin calls the TensorRT library, so the inference itself is executed on the device.
You can refer to Gst-nvinfer — DeepStream 6.3 Release documentation and Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
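
To illustrate why a parser can be built with plain g++, here is a minimal sketch of host-side post-processing. It assumes the network output is already in host memory. The function name `parseDetections`, the `Box` struct, and the 7-floats-per-detection layout are hypothetical choices for this example; they are not the DeepStream parser interface or the output format of any specific SSD model.

```cpp
// Host-side post-processing sketch: plain C++, no CUDA needed.
// ASSUMPTION: each detection is 7 floats
//   [imageId, classId, score, x1, y1, x2, y2] with normalized corners.
// This layout is hypothetical and chosen only for illustration.
#include <cstddef>
#include <vector>

struct Box {
    int classId;
    float score;
    float left, top, width, height;  // in pixels of the network input
};

// hostOutput points to output data that was already copied from device
// memory to host memory, so ordinary CPU code can read it directly.
std::vector<Box> parseDetections(const float* hostOutput,
                                 std::size_t numDetections,
                                 float threshold,
                                 int netWidth, int netHeight) {
    constexpr std::size_t kStride = 7;  // floats per detection (assumed)
    std::vector<Box> boxes;
    for (std::size_t i = 0; i < numDetections; ++i) {
        const float* det = hostOutput + i * kStride;
        const float score = det[2];
        if (score < threshold) {
            continue;  // drop low-confidence detections
        }
        // Scale normalized corners to pixel coordinates.
        const float x1 = det[3] * netWidth;
        const float y1 = det[4] * netHeight;
        const float x2 = det[5] * netWidth;
        const float y2 = det[6] * netHeight;
        boxes.push_back(Box{static_cast<int>(det[1]), score,
                            x1, y1, x2 - x1, y2 - y1});
    }
    return boxes;
}
```

Nothing here touches the GPU: the function only reads a float buffer and fills a vector, which is why a host compiler is enough for this stage.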

twangbarang · January 28, 2021, 4:07pm

I see. Can you tell me where in the pipeline the bounding boxes are actually drawn onto the frames and how they get forwarded there?

bcao · February 2, 2021, 8:26am

The OSD component does the drawing, based on the display metadata.
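
As a rough picture of that flow, the sketch below separates "attach metadata" from "draw". An upstream stage only records rectangles alongside the frame, and a later stage reads them and draws outlines. All names here (`Frame`, `RectMeta`, `attachDetections`, `drawOverlays`) are hypothetical and are not the DeepStream metadata API; the single-channel pixel buffer is also just an assumption to keep the example small.

```cpp
// Conceptual sketch: detections travel with the frame as metadata,
// and a later stage draws them. All types are hypothetical.
#include <cstddef>
#include <cstdint>
#include <vector>

struct RectMeta {
    int left, top, width, height;  // pixels
};

struct Frame {
    int width, height;
    std::vector<std::uint8_t> pixels;  // single-channel image (assumed)
    std::vector<RectMeta> rects;       // metadata attached upstream
};

// Upstream stage (inference side): attaches rectangles to the frame
// without modifying any pixels.
void attachDetections(Frame& frame, const std::vector<RectMeta>& detections) {
    frame.rects.insert(frame.rects.end(), detections.begin(), detections.end());
}

// Downstream stage (on-screen display side): draws a one-pixel outline
// for every rectangle found in the frame's metadata.
void drawOverlays(Frame& frame, std::uint8_t color) {
    auto setPixel = [&frame, color](int x, int y) {
        if (x >= 0 && x < frame.width && y >= 0 && y < frame.height) {
            frame.pixels[static_cast<std::size_t>(y) * frame.width + x] = color;
        }
    };
    for (const RectMeta& r : frame.rects) {
        if (r.width <= 0 || r.height <= 0) {
            continue;  // nothing to draw
        }
        const int right = r.left + r.width - 1;
        const int bottom = r.top + r.height - 1;
        for (int x = r.left; x <= right; ++x) {
            setPixel(x, r.top);
            setPixel(x, bottom);
        }
        for (int y = r.top; y <= bottom; ++y) {
            setPixel(r.left, y);
            setPixel(right, y);
        }
    }
}
```

The point of the split is that the inference stage never needs to know how boxes are rendered; it only fills in metadata that a later element consumes.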
