How to perform GEMM using cuDNN?

I know cuDNN implements convolution using GEMM (with the required data rearrangement), but is there any way to perform GEMM directly using cuDNN?
Of course, there are cuBLAS and CUTLASS for GEMM, but I still want to perform GEMM using the cuDNN API.

Hi @MangoDalDalRaccoon,
Do you mean performing matrix multiplication using cuDNN?
cuDNN will call GEMM kernels, but that is to perform convolution, as you already mentioned.
Could you please elaborate on what you are asking for?

Thanks!

Yes, thanks for replying.
I mean just GEMM. For example, multiplying two matrices with shapes
a = 2x3
b = 4x3
with matrix b transposed, so that c = a × bᵀ has shape 2x4.
I understand that cudnnConvolutionForward is a wrapper around optimized GEMM kernels, and I wonder how to call that GEMM functionality directly for my matrix multiplication.
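
To pin down the semantics, this is the computation I am after, written as a plain CPU reference (the values are illustrative only):

```cpp
#include <cstdio>

int main() {
    // c = a * b^T: a is 2x3, b is 4x3, c is 2x4 (all row-major).
    const int M = 2, K = 3, N = 4;
    float a[M][K] = {{1, 2, 3}, {4, 5, 6}};
    float b[N][K] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {1, 1, 1}};
    float c[M][N] = {};

    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < K; ++k)
                c[i][j] += a[i][k] * b[j][k];  // b indexed row-wise, i.e. b transposed

    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) printf("%6.1f", c[i][j]);
        printf("\n");
    }
    return 0;
}
```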

Hi @MangoDalDalRaccoon,
There are a few options you may try here:

  • Use a matmul op via the backend API (see the backend API docs for more details).
  • Use a matmul op via the cuDNN C++ frontend (see run_matmul_bias_gelu in fusion_sample.cpp for an example).
  • Transform the GEMM problem into a 1x1 convolution and call cuDNN convolution, through either the legacy cuDNN API or the backend/frontend API (a sketch of this approach is shown below).
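
For illustration, here is a minimal sketch of the third option through the legacy API, assuming float data and the 2x3 / 4x3 shapes from the question above (values are illustrative and error handling is mostly omitted):

```cpp
#include <cudnn.h>
#include <cuda_runtime.h>
#include <cstdio>

// c = a * b^T expressed as a 1x1 convolution:
//   a (2x3, row-major) -> input tensor  NCHW = [2, 3, 1, 1] (rows -> batch, cols -> channels)
//   b (4x3, row-major) -> filter        KCHW = [4, 3, 1, 1] (rows -> output channels)
//   c (2x4, row-major) -> output tensor NCHW = [2, 4, 1, 1]
int main() {
    const int M = 2, K = 3, N = 4;
    const float ha[M * K] = {1, 2, 3, 4, 5, 6};
    const float hb[N * K] = {1, 0, 0,  0, 1, 0,  0, 0, 1,  1, 1, 1};
    float hc[M * N];

    float *da, *db, *dc;
    cudaMalloc(&da, sizeof(ha));
    cudaMalloc(&db, sizeof(hb));
    cudaMalloc(&dc, sizeof(hc));
    cudaMemcpy(da, ha, sizeof(ha), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, sizeof(hb), cudaMemcpyHostToDevice);

    cudnnHandle_t handle;
    cudnnCreate(&handle);

    cudnnTensorDescriptor_t xDesc, yDesc;
    cudnnFilterDescriptor_t wDesc;
    cudnnConvolutionDescriptor_t convDesc;
    cudnnCreateTensorDescriptor(&xDesc);
    cudnnCreateTensorDescriptor(&yDesc);
    cudnnCreateFilterDescriptor(&wDesc);
    cudnnCreateConvolutionDescriptor(&convDesc);

    cudnnSetTensor4dDescriptor(xDesc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, M, K, 1, 1);
    cudnnSetFilter4dDescriptor(wDesc, CUDNN_DATA_FLOAT, CUDNN_TENSOR_NCHW, N, K, 1, 1);
    cudnnSetTensor4dDescriptor(yDesc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, M, N, 1, 1);
    // 1x1 "kernel", no padding, unit stride/dilation, float accumulation.
    cudnnSetConvolution2dDescriptor(convDesc, 0, 0, 1, 1, 1, 1,
                                    CUDNN_CROSS_CORRELATION, CUDNN_DATA_FLOAT);

    const float alpha = 1.0f, beta = 0.0f;
    // IMPLICIT_GEMM needs no workspace; other algorithms may be faster but require one.
    cudnnStatus_t st = cudnnConvolutionForward(
        handle, &alpha, xDesc, da, wDesc, db, convDesc,
        CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM, nullptr, 0, &beta, yDesc, dc);
    printf("status: %s\n", cudnnGetErrorString(st));

    cudaMemcpy(hc, dc, sizeof(hc), cudaMemcpyDeviceToHost);
    for (int i = 0; i < M; ++i) {              // print c row by row
        for (int j = 0; j < N; ++j) printf("%6.1f", hc[i * N + j]);
        printf("\n");
    }

    cudnnDestroyConvolutionDescriptor(convDesc);
    cudnnDestroyFilterDescriptor(wDesc);
    cudnnDestroyTensorDescriptor(yDesc);
    cudnnDestroyTensorDescriptor(xDesc);
    cudnnDestroy(handle);
    cudaFree(dc); cudaFree(db); cudaFree(da);
    return 0;
}
```

This works because a 1x1 cross-correlation computes y[n][k] = Σ_c x[n][c] · w[k][c], which is exactly c = a × bᵀ once a is mapped to the input tensor and b to the filter.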

Thanks!

Hi @AakankshaS,

I am also confused about how to write a matmul using cuDNN.

  1. I cannot find any code samples showing how to use the backend API to write a matmul, and I find it difficult to put the descriptors together from the documentation alone. Could you please provide some material on using the backend API to write a matmul-like operator?

  2. I wrote a matmul with the frontend API, but I got the message: “Fusion with float inputs is only supported on Ampere or later”. I want to run this code on a V100 GPU. Is there a way to solve this problem?

In short, all I want is to run a matmul with cuDNN on a V100, but I am having problems with both the frontend and backend APIs. So I am looking forward to some help.

Thanks!!!