What does f16f16_f16f16_f16 mean in a cuDNN GEMM kernel name?

When I run a conv3d, I see this kernel:
sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f32_f32_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1_tensor16x8x16_kernel_cudnn
but when I use TensorRT, the kernel is:
sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1_tensor16x8x16
The difference between the two kernels is the "f16f16_f16f16_f16" part. I want to know what the five "f16" fields mean, and why they differ between the two kernels.
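For reference, here is a minimal Python sketch of how I read the name. The helper `split_dtype_block` is hypothetical, and my annotation of the fields (operand types, accumulation type, output type) is a guess rather than anything documented; the only thing visibly different is that the cuDNN kernel carries f32 tokens where the TensorRT one carries f16.

```python
# Rough sketch: pull out the dtype token block from an xmma kernel name.
# Assumption (not an official spec): the block after "indexed_" encodes the
# implicit-GEMM element types, roughly input A/B types, accumulator type(s),
# and output type. The naive 3-character split below only handles f16/f32 tokens.

def split_dtype_block(kernel_name: str) -> list[str]:
    """Return the dtype tokens between 'indexed_' and the layout part ('_nhwckrsc')."""
    block = kernel_name.split("indexed_")[1].split("_nhwckrsc")[0]
    tokens = []
    for group in block.split("_"):
        while group:
            tokens.append(group[:3])  # "f16f32" -> "f16", "f32"
            group = group[3:]
    return tokens

cudnn_kernel = ("sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f32_f32"
                "_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1"
                "_tensor16x8x16_kernel_cudnn")
trt_kernel = ("sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f16_f16"
              "_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1"
              "_tensor16x8x16")

print(split_dtype_block(cudnn_kernel))  # ['f16', 'f16', 'f16', 'f32', 'f32']
print(split_dtype_block(trt_kernel))    # ['f16', 'f16', 'f16', 'f16', 'f16']
```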

Hi @137095576 ,
Apologies for the delayed response.
Kindly reach out to the TensorRT forum for better assistance with this issue.

Thanks


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.