I am interested in the contents of the link below.
I am looking for published performance data (latency in milliseconds) for Jetson AGX Xavier (with DLA, the Deep Learning Accelerator) inference with a VGG16 CNN.
Specifically, the layer-by-layer latency when executing inference with the VGG16 model on the ImageNet dataset (or a similar dataset).
I am looking for the latency (from the start of inference processing in a layer to the end of processing in that same layer) listed for each layer, for example:
CONV1 layer …
The answer lists Time and Avg. Time. What does Time mean?
And why does the FC layer take so long?
If there is code that I can test myself, could you share it?
I'm using the Jetson TX2 board.
We shared the data in that topic because the user didn't have a Jetson board.
Since you have a TX2 device already, you can measure it directly.
Those numbers are performance profiling output from TensorRT.
Please add the --dumpProfile flag to output the layer-level profiling data:
$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model] --dumpProfile
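If you want the DLA path specifically (available on AGX Xavier; the TX2 has no DLA), a minimal variant of the same command, assuming DLA core 0 and FP16 precision since the DLA does not execute FP32:
$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model] --useDLACore=0 --allowGPUFallback --fp16 --dumpProfile
The --allowGPUFallback flag lets layers the DLA does not support run on the GPU instead; without it, engine building fails if any layer cannot be placed on the DLA.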
Time indicates the total execution time across the 551 iterations.
Avg. Time is the average time per iteration.
Ex. 0.02 (Avg. Time) = 10.92 (Time) / 551 (iterations)
Please note that TensorRT will automatically merge layers for better performance.
So you may find that a fully-connected layer appears to take a long time, since its reported time also includes operations fused into it from other layers.
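If you want to see which operations ended up in each profiled layer, you can also export the per-layer timings to JSON (the --exportProfile flag is part of trtexec, though flag availability depends on your TensorRT version); fused layers show up under combined names such as "conv1 + relu1":
$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model] --dumpProfile --exportProfile=profile.json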