How can I perform GEMM with INT8 in cuBLAS with DRIVE PX2

Jinwei · February 24, 2017, 12:39pm

CUDA version: 8.0.50
OS: Ubuntu 16.04
Hardware: DRIVE PX 2, on dGPU (GP106)

Calling cublasGemmEx with M=N=K=lda=ldb=ldc=4096, alpha=1, beta=0 (both int32_t on the host), Atype=Btype=CUDA_R_8I, and Ctype=computeType=CUDA_R_32I always returns CUBLAS_STATUS_NOT_SUPPORTED, no matter which algorithm I use (CUBLAS_GEMM_DFALT/ALGO0/1/2/3/4/5/6/7).
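
For reference, the result this call is expected to produce can be sketched on the CPU. This is only a minimal reference under stated assumptions: no transposition of A or B, column-major storage as cuBLAS uses, and int32_t accumulation. The function name gemm_s8s32_ref is hypothetical and is not part of cuBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU reference for C = alpha * A * B + beta * C (no transposition).
// A is m x k, B is k x n, C is m x n, all stored column-major.
// Inputs are int8_t; products are accumulated in int32_t.
// gemm_s8s32_ref is a hypothetical helper name, not a cuBLAS function.
void gemm_s8s32_ref(int m, int n, int k,
                    std::int32_t alpha,
                    const std::int8_t* A, int lda,
                    const std::int8_t* B, int ldb,
                    std::int32_t beta,
                    std::int32_t* C, int ldc) {
    // Leading dimensions must cover the row count of each matrix.
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            std::int32_t acc = 0;
            for (int p = 0; p < k; ++p) {
                const std::size_t a_idx = i + static_cast<std::size_t>(p) * lda;
                const std::size_t b_idx = p + static_cast<std::size_t>(j) * ldb;
                acc += static_cast<std::int32_t>(A[a_idx]) *
                       static_cast<std::int32_t>(B[b_idx]);
            }
            const std::size_t c_idx = i + static_cast<std::size_t>(j) * ldc;
            // With beta == 0 the previous contents of C are ignored.
            const std::int32_t prev = (beta == 0) ? 0 : beta * C[c_idx];
            C[c_idx] = alpha * acc + prev;
        }
    }
}
```

On a platform where the cuBLAS call succeeds (such as the GTX 1080 setup mentioned below), the GPU output could be compared element by element against this reference for a small M, N, K.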

I noticed that the CUDA 8 Performance Overview (released in November 2016, page 22) includes a benchmark for INT8 GEMM on the Tesla P40 that reaches 32 TFLOPS throughput.
The cuBLAS main page (https://developer.nvidia.com/cublas, in the Key Features section) also says that cuBLAS supports integer (INT8) matrix multiplication operations.

The same test code passes with CUDA 8.0.61 on Ubuntu 16.04 x86_64 with a GTX 1080.
But it still does not work on DRIVE PX2.

SteveNV · February 28, 2017, 12:01am

Hello Jinwei,

As you may know, DRIVE PX2 has two GPUs (the internal GPU and the dGPU).
Could you please first check which GPU you are using with the ./deviceQuery command?
If you use the dGPU, you should select it with “export CUDA_VISIBLE_DEVICES=0” (for the internal GPU, “export CUDA_VISIBLE_DEVICES=1”). Thanks.

Jinwei · February 28, 2017, 3:28pm

Thank you for your reply.
I am sure that I am using GPU0 (i.e. the dGPU, with cudaDeviceProp.name=GP106).
BTW, running on GPU1 (i.e. the internal GPU, with cudaDeviceProp.name=GP10B) also returns CUBLAS_STATUS_NOT_SUPPORTED.

LitLeo · May 17, 2017, 10:00am

Hi, I ran into the same problem. Did you solve it?

SteveNV · May 18, 2017, 2:11am

Hello LitLeo,

May I know which DPX2 PDK version you are using? Thanks.

LitLeo · May 18, 2017, 3:26am

My environment is Tesla P4, CentOS 7.2, gcc 4.8, CUDA 8.0.
The error message:
** On entry to GEMM_EX parameter number 6 had an illegal value
** On entry to GEMM_EX parameter number 6 had an illegal value
Cublas failure
Error code CUBLAS_STATUS_NOT_SUPPORTED

SteveNV · May 18, 2017, 5:31am

Hello LitLeo,

Are you using the DRIVE PX2 platform?
I think your question may not be related to the DRIVE PX2 platform.
