List of kernels and what they stand for


First of all, thanks for TensorRT; it’s a great tool.

I built a custom model in TensorFlow and converted it to TensorRT. Now when I profile my application, I see statistics like:

Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   18.71%  355.70ms      1860  191.24us  101.19us  268.46us  volta_hcudnn_128x128_relu_small_nn_v1
                   18.69%  355.31ms      1500  236.87us  51.074us  861.37us  trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
                    6.60%  125.42ms       540  232.25us  57.026us  1.0239ms  trt_volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1

But now I’m confused about the kernel names. I guess volta_hcudnn_128x128_relu_small_nn_v1 is computing ReLU on a 128x128 input, but then what is trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1?

Is there documentation about the kernels and what they do? My end goal is to profile my model, identify the bottlenecks, and improve it, but that is hard without a clear picture of what each kernel is doing.
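In case it helps frame the question, here is a minimal sketch of ranking the kernels by share of GPU time and splitting each name into its underscore-separated tokens. It assumes the profiler summary is available as plain text in the column layout shown above. The regular expression and the helper name `parse_rows` are hypothetical, written only for this illustration. Splitting on underscores only exposes the tokens; it does not say what any of them mean, which is exactly what I am asking about.

```python
import re

# Rows copied from the profiler summary in this post.
PROFILE = """\
 GPU activities:   18.71%  355.70ms      1860  191.24us  101.19us  268.46us  volta_hcudnn_128x128_relu_small_nn_v1
                   18.69%  355.31ms      1500  236.87us  51.074us  861.37us  trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
                    6.60%  125.42ms       540  232.25us  57.026us  1.0239ms  trt_volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
"""

# Hypothetical pattern for one summary row:
# Time(%)  Time  Calls  Avg  Min  Max  Name
ROW = re.compile(
    r"(?P<pct>[\d.]+)%\s+(?P<time>\S+)\s+(?P<calls>\d+)\s+"
    r"(?P<avg>\S+)\s+(?P<min>\S+)\s+(?P<max>\S+)\s+(?P<name>\S+)\s*$"
)


def parse_rows(text):
    """Return one dict per kernel row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.search(line)
        if m:
            rows.append({
                "pct": float(m["pct"]),
                "calls": int(m["calls"]),
                "name": m["name"],
                # Raw tokens only; their meaning is not inferred here.
                "tokens": m["name"].split("_"),
            })
    return rows


rows = parse_rows(PROFILE)

# Largest share of GPU time first.
for r in sorted(rows, key=lambda r: r["pct"], reverse=True):
    print(f'{r["pct"]:6.2f}%  {r["calls"]:5d} calls  {r["tokens"]}')
```

Grouping rows that share tokens (for example everything containing `h884cudnn`) would be a natural next step once the meaning of the tokens is known.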


