About cublasGemm INT8 support

anon37147145 · March 14, 2017, 7:57pm

Hi,

Several high-level resources about cuBLAS mention the support of INT8 matrix multiplication (in this cuBLAS introduction, this blog post or this one).

However, after looking at the online documentation and doing some actual experiments on a Titan X Pascal, it is unclear to me whether cublasGemm supports INT8 as the computation precision or not.

The closest I can find is the cublasGemmEx function, which accepts INT8 data as input but performs the computation in half precision at minimum.

Is the documentation not up-to-date or am I missing something?

Thanks,

Guillaume

Robert_Crovella · March 14, 2017, 9:05pm

I think there may be some gaps in the documentation, but it appears that a CUDA_R_8I, CUDA_R_8I, CUDA_R_32I combination is supported for cublasGemmEx, as described in the documentation you linked:

“For CUDA_R_32I computation type the matrix types combinations supported by cublasGemmEx are listed below. This path is only supported with alpha, beta being either 1 or 0; A, B being 32-bit aligned; and lda, ldb being multiples of 4.”

and I would expect this to take advantage of the dp4a instruction which is at the heart of int8 acceleration available in cc 6.1 GPUs.
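
As a rough illustration of the arithmetic involved, here is a minimal host-side C++ sketch of an INT8-input, INT32-accumulate GEMM built on a 4-element dot product, which is what dp4a computes. It is not a cuBLAS call and does not claim to show how cuBLAS implements this path. The names `dp4a_ref` and `gemm_s8s8s32_ref` are hypothetical, and the storage layout (column-major, A stored as K x M so that each dot product reads contiguous bytes) and the requirement that k be a multiple of 4 are assumptions of the sketch. The asserts on alpha, beta, lda and ldb mirror the restrictions in the documentation quote above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Host-side emulation of the dp4a operation: a 4-element dot product of
// signed 8-bit values, added to a 32-bit accumulator.
// (Hypothetical helper; on the device the intrinsic is __dp4a.)
inline int32_t dp4a_ref(const int8_t* a, const int8_t* b, int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return acc;
}

// Reference for C = alpha * A^T * B + beta * C with INT8 inputs and INT32 output.
// Assumed layout (column-major):
//   A is stored as k x m with leading dimension lda (used transposed),
//   B is stored as k x n with leading dimension ldb,
//   C is stored as m x n with leading dimension ldc.
void gemm_s8s8s32_ref(int m, int n, int k,
                      int32_t alpha, const int8_t* A, int lda,
                      const int8_t* B, int ldb,
                      int32_t beta, int32_t* C, int ldc) {
    // Restrictions taken from the documentation quote above.
    assert(alpha == 0 || alpha == 1);
    assert(beta == 0 || beta == 1);
    assert(lda % 4 == 0 && ldb % 4 == 0);
    // Assumption of this sketch only: k is consumed four elements at a time.
    assert(k % 4 == 0);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            int32_t acc = 0;
            for (int p = 0; p < k; p += 4) {
                acc = dp4a_ref(A + i * lda + p, B + j * ldb + p, acc);
            }
            // With beta == 0 the previous contents of C are ignored.
            const int32_t prev = (beta == 0) ? 0 : C[i + j * ldc];
            C[i + j * ldc] = alpha * acc + prev;
        }
    }
}
```

Accumulating in 32 bits matters here because a single product of two INT8 values can already reach 16384 in magnitude, so even a short dot product would overflow an 8-bit or 16-bit result.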

anon37147145 · March 30, 2017, 8:21am

Indeed, it seems that this configuration is the so-called INT8 GEMM.

Thanks!

adit_bhrgv · September 15, 2017, 11:54am

Hello,

Does cublasGemmEx() support unsigned INT8 multiplication?
For the CUDA_R_8I, CUDA_R_8I, CUDA_R_32I combination, only signed INT8 values are supported…

How can I run it on unsigned INT8 data?

Thanks
