List of kernels and what they stand for

eehtomit · May 28, 2019, 7:24pm

Hi,

First of all, thanks for TensorRT; it’s a great tool.

I built a custom model using TensorFlow and converted it to TensorRT. Now when I profile my application, I see statistics like:

            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   18.71%  355.70ms      1860  191.24us  101.19us  268.46us  volta_hcudnn_128x128_relu_small_nn_v1
                   18.69%  355.31ms      1500  236.87us  51.074us  861.37us  trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
                    6.60%  125.42ms       540  232.25us  57.026us  1.0239ms  trt_volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
...

But now I’m confused about the kernel names. I guess volta_hcudnn_128x128_relu_small_nn_v1 is computing relu on a 128x128 input, but then what is trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1?

Is there documentation about the kernels and what they are doing? My end goal is to profile my model, identify the bottlenecks, and improve it, but that’s hard without a clear picture of what each kernel does.
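
In case it helps frame the question, here is a minimal Python sketch of how I would rank kernels by total time from a profiler summary like the one above. The helper names `parse_time` and `parse_summary` are hypothetical (they are not part of any NVIDIA tool), and the regular expression assumes the exact column layout shown in my output (Time(%), Time, Calls, Avg, Min, Max, Name). It does not try to interpret the kernel names; it only sorts them.

```python
import re

# Multipliers that convert the time suffixes seen in the summary to milliseconds.
_UNIT_TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1e3}


def parse_time(text):
    """Convert a string such as '355.70ms' or '191.24us' to milliseconds."""
    match = re.fullmatch(r"([0-9.]+)(ns|us|ms|s)", text)
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return float(match.group(1)) * _UNIT_TO_MS[match.group(2)]


# One summary row (assumed layout): Time(%), Time, Calls, Avg, Min, Max, Name.
_ROW = re.compile(
    r"(?P<pct>[0-9.]+)%\s+(?P<time>\S+)\s+(?P<calls>\d+)\s+"
    r"(?P<avg>\S+)\s+(?P<min>\S+)\s+(?P<max>\S+)\s+(?P<name>.+?)\s*$"
)


def parse_summary(text):
    """Return one dict per kernel row, sorted by total time (largest first)."""
    rows = []
    for line in text.splitlines():
        match = _ROW.search(line)
        if match is None:
            continue  # header, '...' and blank lines are skipped
        rows.append({
            "name": match.group("name"),
            "percent": float(match.group("pct")),
            "total_ms": parse_time(match.group("time")),
            "calls": int(match.group("calls")),
            "avg_ms": parse_time(match.group("avg")),
            "min_ms": parse_time(match.group("min")),
            "max_ms": parse_time(match.group("max")),
        })
    return sorted(rows, key=lambda row: row["total_ms"], reverse=True)


# The three rows from the post, copied verbatim.
SAMPLE = """\
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   18.71%  355.70ms      1860  191.24us  101.19us  268.46us  volta_hcudnn_128x128_relu_small_nn_v1
                   18.69%  355.31ms      1500  236.87us  51.074us  861.37us  trt_volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
                    6.60%  125.42ms       540  232.25us  57.026us  1.0239ms  trt_volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
"""

if __name__ == "__main__":
    for row in parse_summary(SAMPLE):
        print(f'{row["total_ms"]:10.2f} ms  {row["calls"]:6d} calls  {row["name"]}')
```

This tells me which kernels dominate, but not which layer of my model each kernel belongs to, which is why I am asking about the naming.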

Thanks!
