What does f16f16_f16f16_f16 mean in a cuDNN GEMM kernel name?

When I run a conv3d, I see this kernel:
sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f32_f32_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1_tensor16x8x16_kernel_cudnn
but when I use TensorRT, the kernel is:
sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1_tensor16x8x16
The difference between the two kernels is the "f16f16_f16f16_f16" part. I want to know what the five "f16" fields mean, and why they differ between the two kernels.
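For reference, here is a minimal Python sketch of how I read the name. The helper `split_dtype_block` is hypothetical, and my annotation of the fields (operand types, accumulation type, output type) is a guess rather than anything documented; the only thing visibly different is that the cuDNN kernel carries f32 tokens where the TensorRT one carries f16.

```python
# Rough sketch: pull out the dtype token block from an xmma kernel name.
# Assumption (not an official spec): the block after "indexed_" encodes the
# implicit-GEMM element types, roughly input A/B types, accumulator type(s),
# and output type. The naive 3-character split below only handles f16/f32 tokens.

def split_dtype_block(kernel_name: str) -> list[str]:
    """Return the dtype tokens between 'indexed_' and the layout part ('_nhwckrsc')."""
    block = kernel_name.split("indexed_")[1].split("_nhwckrsc")[0]
    tokens = []
    for group in block.split("_"):
        while group:
            tokens.append(group[:3])  # "f16f32" -> "f16", "f32"
            group = group[3:]
    return tokens

cudnn_kernel = ("sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f32_f32"
                "_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1"
                "_tensor16x8x16_kernel_cudnn")
trt_kernel = ("sm80_xmma_fprop_implicit_gemm_indexed_f16f16_f16f16_f16"
              "_nhwckrsc_nhwc_tilesize256x32x32_stage4_warpsize4x1x1_g1"
              "_tensor16x8x16")

print(split_dtype_block(cudnn_kernel))  # ['f16', 'f16', 'f16', 'f32', 'f32']
print(split_dtype_block(trt_kernel))    # ['f16', 'f16', 'f16', 'f16', 'f16']
```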

Hi @137095576 ,
Apologies for the delayed response.
Kindly reach out to the TensorRT forum for better assistance with this issue.

Thanks


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.