Questions about using TensorRT - batch size

Hi.

I am running some experiments with TensorRT.
I am following the instructions described here: ( https://devtalk.nvidia.com/default/topic/1050377/jetson-nano/deep-learning-inference-benchmarking-instructions/ )
Command example : ./trtexec --output=prob --deploy=…/data/googlenet/googlenet.prototxt --fp16 --batch=1
result : Average over 10 runs is 0.854739 ms (host walltime is 1.02564 ms, 99% percentile time is 0.871968) (The full output is longer; I am quoting only one line.)

My understanding is that, depending on GPU performance, increasing the batch size up to a point improves throughput. So I raised the batch size.
Command example : ./trtexec --output=prob --deploy=…/data/googlenet/googlenet.prototxt --fp16 --batch=10
result : Average over 10 runs is 2.18824 ms (host walltime is 2.45414 ms, 99% percentile time is 2.21197).

The result showed that the average time increased. Does the 2.xxx ms figure mean the time for a single run at batch size 10, i.e., one inference over 10 images?
In other words, did it process 10 images in total at batch size 1 (10 runs × 1 image), and 100 images in total at batch size 10 (10 runs × 10 images)?
The "Average over 10 runs is" wording makes this a little confusing.
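To show the arithmetic behind my question: if my assumption is right that each "run" is one inference over the whole batch, then the per-image latency would be the average run time divided by the batch size, and batch 10 would actually be faster per image. This is only a sketch of my interpretation, using the numbers quoted above:

```python
# Assumption (not confirmed): each "run" processes one full batch,
# so per-image latency = average run time / batch size.
avg_ms_batch1 = 0.854739   # average over 10 runs, --batch=1 (quoted above)
avg_ms_batch10 = 2.18824   # average over 10 runs, --batch=10 (quoted above)

per_image_b1 = avg_ms_batch1 / 1
per_image_b10 = avg_ms_batch10 / 10

print(f"per-image latency at batch=1 : {per_image_b1:.4f} ms")
print(f"per-image latency at batch=10: {per_image_b10:.4f} ms")
```

Under that reading, the per-image time drops from about 0.85 ms to about 0.22 ms, which would match the expected throughput gain. Is that the correct way to read the output?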