Performance data (latency) for VGG16 layer-by-layer inference on T4

niliev4 · May 16, 2021, 1:21pm

Hello,

I am looking for published performance data (latency in milliseconds) for Tesla T4 inference with a VGG16 CNN.
Specifically, I need layer-by-layer latency when running inference with the VGG16 model on the ImageNet dataset (or a similar dataset).

I am looking for latency data (from the start of a layer's processing on the T4 to the end of processing by that same layer), listed for each layer. For example:

CONV1 layer - x1 ms
CONV2 layer - x2 ms
…
Fully_connected FC8 layer - y_fc8 ms
Fully_connected FC7 layer - y_fc7 ms
Fully_connected FC6 layer - y_fc6 ms

These are the layers I’m interested in. I have a VLSI hardware background and I’m familiar with (multi-cycle) hardware pipeline stages that have start/done processing flags per stage; these flags allow easy and accurate per-stage latency measurements in hardware. Intuitively, similar start/done flags for each CNN layer could be used to profile inference latency per layer. Perhaps the T4 has such start/done flags, and software applications have already used them to extract layer-by-layer inference latency?
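To make the start/done idea concrete, here is a minimal host-side sketch in Python of the kind of measurement I have in mind. It is only an illustration: `profile_layers` and the three stand-in stages are hypothetical names, not real VGG16 layers and not any T4 or NVIDIA API. It also assumes each stage finishes before the "done" timestamp is taken; since GPU kernels are launched asynchronously with respect to the host, a real measurement would presumably need device-side timestamps or a synchronization point per layer.

```python
import time


def profile_layers(layers, x):
    """Run x through each (name, fn) stage in order and time every stage."""
    timings = []
    for name, fn in layers:
        start = time.perf_counter()  # "start" flag for this stage
        x = fn(x)
        done = time.perf_counter()   # "done" flag for this stage
        timings.append((name, (done - start) * 1e3))  # latency in ms
    return x, timings


# Hypothetical stand-in stages (assumption): trivial list operations,
# used only so the sketch runs without a GPU or a real model.
layers = [
    ("CONV1", lambda v: [2 * e for e in v]),
    ("CONV2", lambda v: [e + 1 for e in v]),
    ("FC6", lambda v: [sum(v)]),
]

output, timings = profile_layers(layers, [1, 2, 3])
for name, ms in timings:
    print(f"{name} layer - {ms:.4f} ms")
```

The printed list has the same shape as the per-layer table above, which is the format of published data I am hoping to find for the T4.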

I’m aware of these benchmarks:

Edge TPU performance benchmarks | Coral

for a VGG16 model, but they list inference processing latency for the entire VGG16 model, and don’t have a layer-by-layer breakdown of the processing latency.

Thank you,
Nick Iliev, Ph.D.
Research Associate
ECE AEON lab
UIC
