Hi all,
I’ve used trtexec to generate a TensorRT engine (.trt) from a YOLOv3-Tiny ONNX model (yolov3-tiny.onnx). With profiling enabled I get a report of the TensorRT YOLOv3-Tiny layers (after fusing/eliminating layers, choosing the best kernel tactics, adding reformatting layers, etc.), so I want to calculate the TOPS (INT8) or TFLOPS (FP16) of each layer and sum them up when I execute the network with the TensorRT runtime.
Is there an approach to calculate the TOPS of each layer, or of the whole network, while it is running?
PS: I know that the AGX Xavier SoC can use both the GPU (512 CUDA cores + 64 Tensor Cores), which can deliver 22 TOPS (INT8) / 11 TFLOPS (FP16), and the DLA cores, but what I am trying to do is calculate the TOPS actually achieved when I run the neural network.
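To make the question concrete, this is the kind of per-layer calculation I have in mind (a rough sketch only; the layer shape is the standard YOLOv3-Tiny first convolution taken from the original cfg, not read back from the TensorRT engine, and the timing is the Avg. Time from the profile below):

# Rough per-layer estimate: ops = 2 * MACs (one multiply + one add),
# divided by the measured average layer time from --dumpProfile.
def conv_macs(out_h, out_w, out_c, in_c, k):
    return out_h * out_w * out_c * in_c * k * k

batch = 16

# 001_convolutional: 3x3 conv, 3 -> 16 channels, 416x416 output (assumed from the cfg)
macs = batch * conv_macs(416, 416, 16, 3, 3)
avg_time_ms = 3.13  # Avg. Time of 001_convolutional in the profile below

tops = 2 * macs / (avg_time_ms * 1e-3) / 1e12
print(f"001_convolutional: {tops:.2f} effective TOPS")  # ~0.76 with these numbers

But I am not sure this is still valid once TensorRT has fused and reformatted the layers.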
Command line I used:
/usr/src/tensorrt/bin/trtexec --onnx=yolov3-tiny-416-bs16.onnx --best --workspace=2048 --saveEngine=yolov3-tiny-416-bs16.trt --calib=calib_yolov3-tiny-int8-416.bin --verbose --dumpProfile
Board: Jetson AGX Xavier
TensorRT version: 7.1.3
cuDNN: 8.0
CUDA: 10.2
JetPack version: 4.5.1
[06/10/2021-16:44:53] [I] === Profile (196 iterations ) ===
[06/10/2021-16:44:53] [I] Layer Time (ms) Avg. Time (ms) Time %
[06/10/2021-16:44:53] [I] 001_convolutional input reformatter 0 76.11 0.39 2.6
[06/10/2021-16:44:53] [I] 001_convolutional 612.51 3.13 20.7
[06/10/2021-16:44:53] [I] 001_convolutional_lrelu 232.50 1.19 7.9
[06/10/2021-16:44:53] [I] 002_maxpool 128.33 0.65 4.3
[06/10/2021-16:44:53] [I] 003_convolutional 295.12 1.51 10.0
[06/10/2021-16:44:53] [I] 003_convolutional_lrelu 109.19 0.56 3.7
[06/10/2021-16:44:53] [I] 004_maxpool 68.56 0.35 2.3
[06/10/2021-16:44:53] [I] 005_convolutional 105.47 0.54 3.6
[06/10/2021-16:44:53] [I] 005_convolutional_lrelu 55.56 0.28 1.9
[06/10/2021-16:44:53] [I] 006_maxpool 36.03 0.18 1.2
[06/10/2021-16:44:53] [I] 007_convolutional 78.12 0.40 2.6
[06/10/2021-16:44:53] [I] 007_convolutional_lrelu 28.68 0.15 1.0
[06/10/2021-16:44:53] [I] 008_maxpool 19.72 0.10 0.7
[06/10/2021-16:44:53] [I] 009_convolutional 74.31 0.38 2.5
[06/10/2021-16:44:53] [I] 009_convolutional_lrelu 16.70 0.09 0.6
[06/10/2021-16:44:53] [I] 010_maxpool 10.84 0.06 0.4
[06/10/2021-16:44:53] [I] 011_convolutional 74.75 0.38 2.5
[06/10/2021-16:44:53] [I] 011_convolutional_lrelu 9.46 0.05 0.3
[06/10/2021-16:44:53] [I] 012_maxpool 15.74 0.08 0.5
[06/10/2021-16:44:53] [I] 013_convolutional 265.81 1.36 9.0
[06/10/2021-16:44:53] [I] 013_convolutional_lrelu 17.18 0.09 0.6
[06/10/2021-16:44:53] [I] 014_convolutional 20.95 0.11 0.7
[06/10/2021-16:44:53] [I] 014_convolutional_lrelu 5.78 0.03 0.2
[06/10/2021-16:44:53] [I] 019_convolutional 5.48 0.03 0.2
[06/10/2021-16:44:53] [I] 015_convolutional 75.62 0.39 2.6
[06/10/2021-16:44:53] [I] 019_convolutional_lrelu input reformatter 0 3.12 0.02 0.1
[06/10/2021-16:44:53] [I] 019_convolutional_lrelu 4.30 0.02 0.1
[06/10/2021-16:44:53] [I] 015_convolutional_lrelu 9.37 0.05 0.3
[06/10/2021-16:44:53] [I] 016_convolutional input reformatter 0 9.28 0.05 0.3
[06/10/2021-16:44:53] [I] 016_convolutional 26.17 0.13 0.9
[06/10/2021-16:44:53] [I] 016_convolutional output reformatter 0 11.55 0.06 0.4
[06/10/2021-16:44:53] [I] 020_upsample input reformatter 0 4.29 0.02 0.1
[06/10/2021-16:44:53] [I] 020_upsample 66.13 0.34 2.2
[06/10/2021-16:44:53] [I] 020_upsample copy 62.05 0.32 2.1
[06/10/2021-16:44:53] [I] 022_convolutional 196.06 1.00 6.6
[06/10/2021-16:44:53] [I] 022_convolutional_lrelu 15.59 0.08 0.5
[06/10/2021-16:44:53] [I] 023_convolutional input reformatter 0 17.04 0.09 0.6
[06/10/2021-16:44:53] [I] 023_convolutional 54.78 0.28 1.9
[06/10/2021-16:44:53] [I] 023_convolutional output reformatter 0 36.01 0.18 1.2
[06/10/2021-16:44:53] [I] Total 2954.25 15.07 100.0
[06/10/2021-16:44:53] [I]
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=yolov3-tiny-416-bs16.onnx --best --workspace=2048 --saveEngine=yolov3-tiny-416-bs16.trt --calib=calib_yolov3-tiny-int8-416.bin --verbose --dumpProfile
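In case it helps, this is how I extract the per-layer average times from the profile above (a minimal sketch; trtexec_profile.log is just a hypothetical file holding the log printed above):

import re

# Pull the per-layer average times (ms) out of the --dumpProfile output.
# Each data line ends with: <total ms> <avg ms> <time %>.
line_re = re.compile(r"\[I\]\s+(.+?)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$")

avg_ms = {}
with open("trtexec_profile.log") as f:  # assumed: the log above saved to a file
    for line in f:
        m = line_re.search(line)
        if m and m.group(1) != "Total":
            avg_ms[m.group(1)] = float(m.group(3))

# e.g. avg_ms["001_convolutional"] == 3.13; the idea is to combine these times
# with per-layer MAC counts from the ONNX graph to get per-layer TOPS.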
Thank you!