Want to know the relationship between the qps (queries per second) and the number of outputs

trivedi.nagaraj · December 6, 2023, 10:44am

Hi all, I have a question about running inference on a model with trtexec using the --dumpOutput option.
The throughput it reports, in queries per second, is:
[12/06/2023-05:06:32] [I] === Performance summary ===
[12/06/2023-05:06:32] [I] Throughput: 52.6221 qps

But the output is printed only once:
[12/06/2023-05:06:32] [I] gpu_0/softmax_1: (1x1000)

I see no relationship between the number of inferences run and the number of outputs dumped.
As I understand it, qps is the number of queries processed per second, so 52.6221 inferences are made each second (correct me if I am wrong). Yet I get only one set of output prediction values.
I expected the number of outputs to equal the number of queries made. Also, how can I check the accuracy of the inference? The inference output file is attached for reference; at the end you can see that the output is dumped only once.
Please clarify these doubts.
resnet50_dump_output.txt (29.5 KB)

AastaLLL · December 7, 2023, 2:17am

Hi,

trtexec only outputs the last iteration results:

$ /usr/src/tensorrt/bin/trtexec --help
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --help
...
=== Reporting Options ===
 ...
  --dumpOutput                Print the output tensor(s) of the last inference iteration (default = disabled)
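
To make the relationship concrete: throughput is a rate measured over the whole benchmark run, while --dumpOutput prints a single tensor from the final iteration. Below is a rough sketch of how many inferences likely ran behind that one printed output. It assumes the default measurement window of about 3 seconds (--duration) and the default minimum of 10 iterations (--iterations); check trtexec --help for your version. The function name estimated_iterations is made up for illustration and is not part of TensorRT.

```python
def estimated_iterations(throughput_qps, duration_s=3.0, min_iterations=10):
    """Rough estimate of how many inferences trtexec ran during measurement.

    Assumes throughput is roughly (number of inferences) / (measurement time),
    and that trtexec runs at least `min_iterations` iterations.
    The defaults here are assumptions taken from typical trtexec --help output.
    """
    return max(min_iterations, round(throughput_qps * duration_s))


# Throughput reported in the log above: 52.6221 qps
print(estimated_iterations(52.6221))
```

So on the order of 150 or more inferences were executed, and only the output of the last one was printed.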

Thanks,

trivedi.nagaraj · December 7, 2023, 6:07am

Hi, thank you for your quick response. How can I verify whether the inference ran successfully? Also, how can I find the accuracy of the model when running it with trtexec?

Please provide this information.

Thanks and Regards

N.M.Trivedi

AastaLLL · December 11, 2023, 6:38am

Hi,

You can check the sample below, which reads input with OpenCV and outputs the classification results:

https://elinux.org/Jetson/L4T/TRT_Customized_Example#OpenCV_with_PLAN_model
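
On accuracy in general: trtexec is mainly a benchmarking tool, and unless you supply inputs yourself it feeds generated data, so the dumped softmax values do not say anything about accuracy. To measure accuracy you need to run labeled images through the engine (for example with a sample like the one linked above) and compare the predictions with the ground truth. Below is a minimal sketch of that comparison step only, assuming you have already collected one score vector per image (such as the 1x1000 softmax output) along with the correct class index. The function name top_k_accuracy and the toy data are hypothetical.

```python
def top_k_accuracy(score_rows, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    score_rows: list of per-class score lists, one per input image
    labels:     list of ground-truth class indices, same length as score_rows
    """
    if len(score_rows) != len(labels):
        raise ValueError("score_rows and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    hits = 0
    for scores, label in zip(score_rows, labels):
        # Class indices sorted by descending score, truncated to the top k
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 classes instead of 1000 (made-up numbers)
rows = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
truth = [1, 1, 2]
print(top_k_accuracy(rows, truth))       # top-1
print(top_k_accuracy(rows, truth, k=2))  # top-2
```

Running this over a validation set (for ResNet-50, typically the ImageNet validation images with the same preprocessing used in training) gives the usual top-1 and top-5 figures.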

Thanks.

system · January 3, 2024, 4:49am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
