After reading the Xavier inference benchmarks (https://developer.nvidia.com/embedded/jetson-agx-xavier-dl-inference-benchmarks), I am a little confused.
Specifically, I don’t understand the meaning of the latency column. For example, in MAX-N mode, the classification entry for ResNet-50 at batch size 128 lists a PERF of 1631 images per second. If I understand this correctly, it means that if I have a set of 100 batches, each of size 128 x 224 x 224 x 3, the Xavier will process 1631/128 ≈ 12.74 batches per second, so it will take about 100/12.74 ≈ 7.85 seconds to process all 100 of my batches. Is that correct? If so, then what is the meaning of the latency column, which reads 78.5 milliseconds?
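To make my reading of the table concrete, here is the back-of-the-envelope arithmetic I am doing. The numbers are just the ones copied from the MAX-N ResNet-50 row, and the 100-batch workload is a hypothetical example of mine, not something from the benchmark page:

```python
# Sanity check of my reading of the benchmark table
# (values copied from the MAX-N ResNet-50 row, batch size 128).

perf_images_per_sec = 1631   # "PERF" column
batch_size = 128             # images per batch
num_batches = 100            # my hypothetical workload

batches_per_sec = perf_images_per_sec / batch_size       # ~12.74 batches/s
total_time_sec = num_batches / batches_per_sec           # ~7.85 s for all 100 batches

# The latency column reads 78.5 ms -- is this simply the time for one batch?
implied_batch_time_ms = 1000 * batch_size / perf_images_per_sec  # ~78.5 ms

print(f"{batches_per_sec:.2f} batches/s, {total_time_sec:.2f} s total, "
      f"{implied_batch_time_ms:.1f} ms per batch")
```

Is that the right way to relate the PERF and latency columns, or does the latency figure mean something else?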