Is cublasHgemm pure half multiplication?

I am trying to find more information about cublasHgemm, but it seems it is no longer documented. What is the equivalent of cublasHgemm in terms of cublasGemmEx?

cublasHgemm() still exists, see here.
cublasSgemmEx() can also handle half-precision inputs, see here; you would select CUDA_R_16F for the matrix types, but the computation is still done in float.
To emulate cublasHgemm() with cublasGemmEx() (see here) you would use CUBLAS_COMPUTE_16F for the compute type, and CUDA_R_16F for the scale type, Atype, Btype, and Ctype.
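To make the parameter mapping concrete, here is a minimal sketch of such a cublasGemmEx() call. It assumes the device buffers `d_A`, `d_B`, and `d_C` (names chosen here for illustration) are already allocated and filled with `__half` data, and it omits error checking for brevity:

```cpp
// Sketch: emulating cublasHgemm() with cublasGemmEx().
// Requires a CUDA-capable GPU and linking against cuBLAS (-lcublas).
#include <cublas_v2.h>
#include <cuda_fp16.h>

void half_gemm(cublasHandle_t handle, int m, int n, int k,
               const __half* d_A, const __half* d_B, __half* d_C)
{
    // With CUBLAS_COMPUTE_16F the scale type is CUDA_R_16F, so
    // alpha and beta are passed as __half.
    const __half alpha = __float2half(1.0f);
    const __half beta  = __float2half(0.0f);

    // All matrix types are CUDA_R_16F and the compute type is
    // CUBLAS_COMPUTE_16F, so both the multiplication and the
    // accumulation happen in half precision, as in cublasHgemm().
    cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                 m, n, k,
                 &alpha,
                 d_A, CUDA_R_16F, m,   // Atype, lda
                 d_B, CUDA_R_16F, k,   // Btype, ldb
                 &beta,
                 d_C, CUDA_R_16F, m,   // Ctype, ldc
                 CUBLAS_COMPUTE_16F,   // half-precision accumulation
                 CUBLAS_GEMM_DEFAULT);
}
```

Note that in cuBLAS versions before 11.0, the compute type argument was a cudaDataType_t, so CUDA_R_16F would be passed there instead of CUBLAS_COMPUTE_16F.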


Is there a difference in their performance? I expect them to be the same, at least for multiplication and accumulation in half.

I wouldn’t expect a significant difference between cublasHgemm() and cublasGemmEx() using CUDA_R_16F, but I haven’t tested it.
