How can I perform INT8 GEMM in cuBLAS on DRIVE PX2?

CUDA version: 8.0.50
OS: Ubuntu 16.04
Hardware: DRIVE PX 2, on dGPU (GP106)

Calling cublasGemmEx with M=N=K=lda=ldb=ldc=4096, alpha=1, beta=0 (both int32_t on the host), Atype=Btype=CUDA_R_8I, and Ctype=computeType=CUDA_R_32I always returns CUBLAS_STATUS_NOT_SUPPORTED, no matter which algorithm I use (CUBLAS_GEMM_DFALT/ALGO0/1/2/3/4/5/6/7).
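
For anyone who wants to check results once the call succeeds, here is a minimal CPU sketch of what that parameter combination is meant to compute: C = alpha * A * B + beta * C with int8 inputs and int32 accumulation. It is only a reference under stated assumptions, not the cuBLAS implementation. The function name gemm_s8s32_ref is hypothetical, and the sketch assumes column-major storage with neither matrix transposed (the transpose flags are not stated above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference: C = alpha * A * B + beta * C.
// A is m x k (int8), B is k x n (int8), C is m x n (int32).
// Assumes column-major storage and no transposition of A or B.
void gemm_s8s32_ref(int m, int n, int k,
                    std::int32_t alpha,
                    const std::vector<std::int8_t>& A, int lda,
                    const std::vector<std::int8_t>& B, int ldb,
                    std::int32_t beta,
                    std::vector<std::int32_t>& C, int ldc)
{
    // Leading dimensions must cover the rows of each column-major matrix.
    assert(lda >= m && ldb >= k && ldc >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            // Widen each int8 operand to int32 before multiplying,
            // so products and the running sum are kept in 32 bits.
            std::int32_t acc = 0;
            for (int p = 0; p < k; ++p) {
                const std::size_t a_idx =
                    static_cast<std::size_t>(i) + static_cast<std::size_t>(p) * lda;
                const std::size_t b_idx =
                    static_cast<std::size_t>(p) + static_cast<std::size_t>(j) * ldb;
                acc += static_cast<std::int32_t>(A[a_idx]) *
                       static_cast<std::int32_t>(B[b_idx]);
            }
            const std::size_t c_idx =
                static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * ldc;
            // With beta == 0 the previous contents of C are ignored.
            const std::int32_t prev = (beta == 0) ? 0 : beta * C[c_idx];
            C[c_idx] = alpha * acc + prev;
        }
    }
}
```

Running this on a small slice of the inputs (the full 4096 case is slow on a CPU) and comparing against the device output should be enough to tell a wrong result from an unsupported configuration.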

I noticed that the CUDA 8 Performance Overview (released in November 2016, page 22) includes a benchmark for INT8 GEMM on Tesla P40, achieving 32 TFLOPS throughput.
The cuBLAS main page (in the Key Features section) also says that cuBLAS supports integer (INT8) matrix multiplication operations.

The same test code passes with CUDA 8.0.61 on Ubuntu 16.04 x86_64 with a GTX 1080.
But it still does not work on DRIVE PX2.

Hello Jinwei,

As you may know, DRIVE PX2 has two GPUs (the internal GPU and the dGPU).
So could you please check which GPU you are using with the ./deviceQuery command first?
If you use the dGPU, you should select it with “export CUDA_VISIBLE_DEVICES=0” (for the internal GPU, “export CUDA_VISIBLE_DEVICES=1”). Thanks.

Thank you for your reply.
I am sure that I am using GPU0 (i.e. dGPU, with
BTW, running on GPU1 (i.e. internal GPU, with returns CUBLAS_STATUS_NOT_SUPPORTED too.

Hi, I ran into the same problem. Did you manage to solve it?

Hello LitLeo,

May I know which DPX2 PDK version you are using? Thanks.

My environment is Tesla P4, CentOS 7.2, gcc 4.8, CUDA 8.0.
The error message:
** On entry to GEMM_EX parameter number 6 had an illegal value
** On entry to GEMM_EX parameter number 6 had an illegal value
Cublas failure

Hello LitLeo,

Are you using the DRIVE PX2 platform?
I think your question may not be related to the DRIVE PX2 platform.