cublasGemmEx does not work on certain GPUs

secretsh · August 22, 2017, 6:01am

I am working with a codebase that calls cublasGemmEx. It works well on a Titan GPU server, but doesn’t work on a K20 one.

Both servers have CUDA 8.0 and cuDNN 5.1 installed, with a GPU driver that supports CUDA 8.0.

The codebase is written in C#. The target architecture in Visual Studio on the K20 server is set to k_20, sm_20. All functions other than cublasGemmEx work well; functions that call cublasGemmEx return all-zero values.

secretsh · August 22, 2017, 8:21am

cudaDeviceSynchronize();
cudaError_t error = cudaGetLastError();
shows no error information
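
A note for anyone hitting the same symptom: cudaGetLastError reports errors from the CUDA runtime, while cuBLAS routines report failures through their own cublasStatus_t return value, so a call that cuBLAS rejects can leave cudaGetLastError clean. Since the codebase here is C# and would receive that status as a plain integer through interop, a minimal sketch of decoding it might look like the following. The helper names cublasStatusName and cublasCallSucceeded are hypothetical, and the numeric values are assumed to match the cublasStatus_t enum in the CUDA 8.0 cublas_api.h; please verify them against your own header.

```cpp
#include <string>

// Hypothetical helper: maps the integer value of a cublasStatus_t to its name.
// The numeric values are an assumption based on the enum in cublas_api.h;
// check them against the header shipped with your toolkit.
std::string cublasStatusName(int status) {
    switch (status) {
        case 0:  return "CUBLAS_STATUS_SUCCESS";
        case 1:  return "CUBLAS_STATUS_NOT_INITIALIZED";
        case 3:  return "CUBLAS_STATUS_ALLOC_FAILED";
        case 7:  return "CUBLAS_STATUS_INVALID_VALUE";
        case 8:  return "CUBLAS_STATUS_ARCH_MISMATCH";
        case 11: return "CUBLAS_STATUS_MAPPING_ERROR";
        case 13: return "CUBLAS_STATUS_EXECUTION_FAILED";
        case 14: return "CUBLAS_STATUS_INTERNAL_ERROR";
        case 15: return "CUBLAS_STATUS_NOT_SUPPORTED";
        case 16: return "CUBLAS_STATUS_LICENSE_ERROR";
        default: return "unknown cublasStatus_t value";
    }
}

// Hypothetical helper: true only when the cuBLAS call reported success.
bool cublasCallSucceeded(int status) {
    return status == 0;
}
```

In native code the status would come straight from the call, for example cublasStatus_t st = cublasGemmEx(handle, ...), and the name could be logged with cublasStatusName(static_cast<int>(st)) whenever cublasCallSucceeded returns false.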

cbuchner1 · August 22, 2017, 8:44am

You might want to run this application under cuda-memcheck. This will catch CUDA API errors that occur internally, as well as out-of-bounds and other illegal memory accesses inside the kernel.

tera · August 22, 2017, 11:40am

“k_20” is not a valid architecture, and “sm_20” is incorrect for the K20.

“compute_35” and “sm_35” are the correct values for the Tesla K20.

secretsh · August 23, 2017, 9:40am

Thank you for all your replies.

I found in the documentation that "cublasCgemmEx is only supported for GPU with architecture capabilities equal or greater than 5.0", so the error value should be CUBLAS_STATUS_ARCH_MISMATCH.
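
The rule quoted above could be turned into a guard that runs before the GEMM call. The sketch below assumes the 5.0 threshold from that sentence also applies to cublasGemmEx, as this thread suggests. ComputeCapability and gemmExSupported are hypothetical names. In a real program the two numbers would come from the major and minor fields of the cudaDeviceProp struct filled in by cudaGetDeviceProperties.

```cpp
// Hypothetical value type holding a device's compute capability,
// e.g. {3, 5} for the Tesla K20 (sm_35).
struct ComputeCapability {
    int major;
    int minor;
};

// Hypothetical guard based on the documentation sentence quoted above:
// the routine is assumed to require compute capability >= 5.0.
// Because minor is never negative, comparing the major number is enough.
bool gemmExSupported(ComputeCapability cc) {
    return cc.major >= 5;
}
```

With the K20's capability of 3.5 this guard returns false, so the caller can fall back to another GEMM routine or report the limitation instead of reading back all-zero results.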