Hi all, I have a question about running inference on a model with trtexec using the --dumpOutput option.
The throughput (queries per second) it reported is:
[12/06/2023-05:06:32] [I] === Performance summary ===
[12/06/2023-05:06:32] [I] Throughput: 52.6221 qps
But the output is dumped only once:
[12/06/2023-05:06:32] [I] gpu_0/softmax_1: (1x1000)
It looks to me as if there is no relationship between the number of queries that were run and the number of outputs that were dumped.
I understand that qps is the number of queries completed per second, so 52.6221 qps means about 52.6 inferences are made every second (correct me if I am wrong). Yet I see only one set of output prediction values.
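To make the mismatch concrete, here is a rough sketch of the arithmetic I have in mind. The throughput is the value from the log above; the 3-second run duration is only my assumption for illustration, not a value taken from the log, and the function name is made up.

```python
# Rough arithmetic: how many inferences a given throughput implies.
# assumed_duration_s is an ASSUMPTION for illustration, not taken from the log.

def expected_inferences(qps: float, duration_s: float) -> int:
    """Number of whole queries completed in duration_s at a steady qps."""
    return int(qps * duration_s)

throughput_qps = 52.6221      # from the trtexec performance summary
assumed_duration_s = 3.0      # hypothetical measurement window

n = expected_inferences(throughput_qps, assumed_duration_s)

# Mean wall-clock time per completed query. This is 1/throughput, which is
# not necessarily the latency of a single query if queries overlap.
mean_time_per_query_ms = 1000.0 / throughput_qps

print(f"inferences in {assumed_duration_s} s: {n}")
print(f"mean time per query: {mean_time_per_query_ms:.4f} ms")
```

So under that assumption I would expect well over a hundred inferences to have run, against the single (1x1000) output that was dumped.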
I would expect the number of dumped outputs to equal the number of queries made. Also, how can I check the accuracy of the inference? I have attached the inference output file for reference; at the end you can see that the output is dumped only once.
Could you please clarify these points?
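For the accuracy part, this is roughly the check I imagine doing once I have the output vectors, written as a minimal sketch. It assumes each output is a list of class probabilities (like the 1x1000 softmax above) and that I have the true label for each input; the function names and the tiny 3-class example data are made up for illustration.

```python
# Minimal sketch of a top-1 accuracy check over dumped softmax outputs.
# ASSUMPTION: each output is a flat list of class probabilities and a
# ground-truth class index is available for every input.

def top1(probs):
    """Index of the highest-probability class."""
    return max(range(len(probs)), key=probs.__getitem__)

def top1_accuracy(outputs, labels):
    """Fraction of outputs whose top-1 class matches the true label."""
    if len(outputs) != len(labels):
        raise ValueError("need exactly one label per output")
    correct = sum(1 for probs, label in zip(outputs, labels) if top1(probs) == label)
    return correct / len(labels)

# Hypothetical 3-class example (the real output would have 1000 entries).
example_outputs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
example_labels = [1, 0, 2, 1]

accuracy = top1_accuracy(example_outputs, example_labels)
print(f"top-1 accuracy: {accuracy:.2f}")
```

With only one dumped output I can compare a single prediction at most, which is why I am asking how this is normally done.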
resnet50_dump_output.txt (29.5 KB)