Does TensorRT use the GPU alone or a mix of CPU and GPU?

Hello Experts,

CC: @Honey_Patouceul @DaneLLL @amycao @kayccc @icornejo.a @AastaLLL @dusty_nv @neuezeal @Jeffli

I'm curious whether the TensorRT SDK uses the GPU alone or a mix of CPU and GPU to run inference and its other functions.

Not sure, but AFAIK it mainly uses the GPU, and maybe the DLA on Xavier as well.
Someone more familiar with this topic may be able to advise better.

Hi,

As Honey_Patouceul said, TensorRT mainly uses the GPU for inference.
On Xavier, TensorRT also supports DLA inference.
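
For anyone who wants to try the DLA path, here is a minimal sketch that assembles a trtexec command line for it. The helper name `trtexec_dla_command` and the file `model.onnx` are placeholders for illustration only. It assumes an ONNX model and relies on trtexec's `--useDLACore`, `--allowGPUFallback` and `--fp16` options (DLA needs FP16 or INT8 precision, and layers it cannot run fall back to the GPU).

```python
import shlex

def trtexec_dla_command(onnx_path, dla_core=0,
                        trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Build an argument list that asks trtexec to run a model on a DLA core.

    onnx_path is a hypothetical model file; dla_core selects which DLA core
    to use on the device.
    """
    return [
        trtexec,
        f"--onnx={onnx_path}",
        f"--useDLACore={dla_core}",  # place supported layers on this DLA core
        "--allowGPUFallback",        # unsupported layers run on the GPU instead
        "--fp16",                    # DLA requires FP16 or INT8 precision
    ]

if __name__ == "__main__":
    cmd = trtexec_dla_command("model.onnx")
    # Print a shell-ready command line; run it yourself on the Jetson.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can also be passed to `subprocess.run` on the target device if you prefer to launch it from Python.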

Thanks.

Hi @AastaLLL

Is it possible to obtain the FLOPS or MIPS consumed by each layer of the model?

Hi,

You can get layer-level profiling data directly from our trtexec app with the --dumpProfile flag.
For example, here is the output for the YOLOv3 Tiny model:

 /usr/src/tensorrt/bin/trtexec [your/model/info] --dumpProfile
...
[11/04/2020-17:35:48] [I] === Profile (265 iterations ) ===
[11/04/2020-17:35:48] [I]                                              Layer   Time (ms)   Avg. Time (ms)   Time %
[11/04/2020-17:35:48] [I]                                             conv_1      201.47             0.76      6.4
[11/04/2020-17:35:48] [I]                                            leaky_1       76.22             0.29      2.4
[11/04/2020-17:35:48] [I]                                          maxpool_2       53.42             0.20      1.7
[11/04/2020-17:35:48] [I]                                             conv_3      131.26             0.50      4.2
[11/04/2020-17:35:48] [I]                                            leaky_3       38.95             0.15      1.2
[11/04/2020-17:35:48] [I]                                          maxpool_4       27.38             0.10      0.9
[11/04/2020-17:35:48] [I]                                             conv_5      113.38             0.43      3.6
[11/04/2020-17:35:48] [I]                                            leaky_5       20.45             0.08      0.6
[11/04/2020-17:35:48] [I]                                          maxpool_6       15.69             0.06      0.5
[11/04/2020-17:35:48] [I]                                             conv_7      123.34             0.47      3.9
[11/04/2020-17:35:48] [I]                                            leaky_7       11.17             0.04      0.4
[11/04/2020-17:35:48] [I]                                          maxpool_8        9.00             0.03      0.3
[11/04/2020-17:35:48] [I]                                             conv_9      133.43             0.50      4.2
[11/04/2020-17:35:48] [I]                                            leaky_9        6.77             0.03      0.2
[11/04/2020-17:35:48] [I]                                         maxpool_10        5.49             0.02      0.2
[11/04/2020-17:35:48] [I]                                            conv_11      139.17             0.53      4.4
[11/04/2020-17:35:48] [I]                                           leaky_11        4.22             0.02      0.1
[11/04/2020-17:35:48] [I]                                         maxpool_12        9.96             0.04      0.3
[11/04/2020-17:35:48] [I]                                            conv_13      497.09             1.88     15.8
[11/04/2020-17:35:48] [I]                                           leaky_13        6.63             0.03      0.2
[11/04/2020-17:35:48] [I]                                            conv_14      826.62             3.12     26.2
[11/04/2020-17:35:48] [I]                                           leaky_14        2.95             0.01      0.1
[11/04/2020-17:35:48] [I]                                            conv_19       15.74             0.06      0.5
[11/04/2020-17:35:48] [I]                                            conv_15      139.78             0.53      4.4
[11/04/2020-17:35:48] [I]                                         postMul_19        0.27             0.00      0.0
[11/04/2020-17:35:48] [I]                                           leaky_19        2.81             0.01      0.1
[11/04/2020-17:35:48] [I]                                          preMul_19        0.25             0.00      0.0
[11/04/2020-17:35:48] [I]                                             mm1_19       24.34             0.09      0.8
[11/04/2020-17:35:48] [I]                                             mm2_19        6.66             0.03      0.2
[11/04/2020-17:35:48] [I]  (Unnamed Layer* 42) [Matrix Multiply]_output copy        4.60             0.02      0.1
[11/04/2020-17:35:48] [I]                                           leaky_15        4.50             0.02      0.1
[11/04/2020-17:35:48] [I]                                            conv_16       42.02             0.16      1.3
[11/04/2020-17:35:48] [I]                                            yolo_17       10.26             0.04      0.3
[11/04/2020-17:35:48] [I]                                            conv_22      376.30             1.42     11.9
[11/04/2020-17:35:48] [I]                                           leaky_22        6.54             0.02      0.2
[11/04/2020-17:35:48] [I]                                            conv_23       45.43             0.17      1.4
[11/04/2020-17:35:48] [I]                                            yolo_24       22.44             0.08      0.7
[11/04/2020-17:35:48] [I]                                              Total     3156.02            11.91    100.0
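
Note that this profile reports time per layer, not FLOPS. If you want a rough throughput figure, one option is to parse the log and divide an analytically computed FLOP count by the average layer time. The sketch below is only an illustration: `parse_profile`, `conv_flops` and `achieved_gflops` are hypothetical helper names, the regular expression assumes the row layout shown above, and the conv_1 shape (3 input channels, 16 filters, 3x3 kernel, 416x416 output) is an assumption about the model, not something the log reports.

```python
import re

# One profile row: "[timestamp] [I] <layer name> <total ms> <avg ms> <percent>"
ROW = re.compile(
    r"^\[[^\]]*\]\s+\[I\]\s+(?P<name>.+?)\s+"
    r"(?P<total>\d+\.\d+)\s+(?P<avg>\d+\.\d+)\s+(?P<pct>\d+\.\d+)\s*$"
)

def parse_profile(log_text):
    """Return (layer, total_ms, avg_ms, percent) tuples from --dumpProfile output."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, or unrelated log line
        name = m.group("name").strip()
        if name == "Total":
            continue  # skip the summary row
        rows.append((name, float(m.group("total")),
                     float(m.group("avg")), float(m.group("pct"))))
    return rows

def conv_flops(c_in, c_out, k_h, k_w, h_out, w_out):
    """Approximate FLOPs of a dense convolution (one multiply-add = 2 FLOPs)."""
    return 2 * c_in * k_h * k_w * c_out * h_out * w_out

def achieved_gflops(flops, avg_ms):
    """Throughput implied by a FLOP count and an average layer time in ms."""
    return flops / (avg_ms * 1e-3) / 1e9

SAMPLE = """\
[11/04/2020-17:35:48] [I] === Profile (265 iterations ) ===
[11/04/2020-17:35:48] [I]      Layer   Time (ms)   Avg. Time (ms)   Time %
[11/04/2020-17:35:48] [I]     conv_1      201.47             0.76      6.4
[11/04/2020-17:35:48] [I]    leaky_1       76.22             0.29      2.4
[11/04/2020-17:35:48] [I]      Total     3156.02            11.91    100.0
"""

if __name__ == "__main__":
    rows = parse_profile(SAMPLE)
    # List the layers from most to least expensive by average time.
    for name, total_ms, avg_ms, pct in sorted(rows, key=lambda r: -r[2]):
        print(f"{name:>10s}  avg {avg_ms:.2f} ms  ({pct:.1f}%)")
    # Assumed conv_1 shape; adjust to your own network definition.
    flops = conv_flops(c_in=3, c_out=16, k_h=3, k_w=3, h_out=416, w_out=416)
    avg_ms = dict((r[0], r[2]) for r in rows)["conv_1"]
    print(f"conv_1: about {achieved_gflops(flops, avg_ms):.0f} GFLOPS (estimate)")
```

Treat the result as an estimate only: TensorRT may fuse layers or pick kernels that do not perform exactly this many operations.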

Thanks.