Kernel function documentation?

I use nvprof to profile my TensorRT FP16 model inference.
With metrics such as single_precision_fu_utilization (passed via the --metrics flag), we can see which functional units a kernel uses.
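For context, this is roughly the invocation I use; the binary name ./trt_app is a placeholder for your own TensorRT inference executable:

```shell
# Collect per-kernel functional-unit utilization metrics with nvprof.
# "./trt_app" stands in for the actual TensorRT inference binary.
nvprof --metrics single_precision_fu_utilization,half_precision_fu_utilization,tensor_precision_fu_utilization,tensor_int_fu_utilization ./trt_app
```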

For example, the following output shows that the kernel cuInt8::nhwc8Tonchhw2 uses the FP32 units but no FP16 units:
Kernel: void cuInt8::nhwc8Tonchhw2<int=32, int=16, int=2>(__half const , cuInt8::nhwc8Tonchhw2<int=32, int=16, int=2>, int, int, int, int, int, int)
Invocations  Metric Name                           Metric Description                               Min       Max       Avg
6            tensor_precision_fu_utilization       Tensor-Precision Function Unit Utilization       Idle (0)  Idle (0)  Idle (0)
6            tensor_int_fu_utilization             Tensor-Int Function Unit Utilization             Idle (0)  Idle (0)  Idle (0)
6            single_precision_fu_utilization       Single-Precision Function Unit Utilization       Low (2)   Low (3)   Low (2)
6            half_precision_fu_utilization         Half-Precision Function Unit Utilization         Idle (0)  Idle (0)  Idle (0)

My questions are:
I’m doing FP16 inference, so there should not be any INT8 operations. However, cuInt8::nhwc8Tonchhw2 appears in the profile, and it seems to be an INT8-related function.

Is there any documentation where I can find what kernel functions such as cuInt8::nhwc8Tonchhw2 are doing?

Thanks in advance!