I am working with a codebase that calls cublasGemmEx. It works well on a Titan GPU server, but doesn't work on a K20 one.
Both servers have CUDA 8.0 and cuDNN 5.1 installed, with a GPU driver that supports CUDA 8.0.
The codebase is written in C#, and the target architecture in Visual Studio on the K20 server is set to k_20, sm_20. All other functions except cublasGemmEx work well. Functions that call cublasGemmEx return all-zero values.
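
Since the outputs are all zero, one thing that may help narrow this down is the status code that cublasGemmEx itself returns. A C# wrapper typically receives it as a plain integer. Below is a minimal sketch, in C++, of a decoder for that integer. The helper name `describeCublasStatus` is hypothetical, and the numeric values are what I understand `cublasStatus_t` to contain in `cublas_api.h`, so they should be checked against the header shipped with the installed toolkit.

```cpp
#include <string>

// Hypothetical helper (not part of cuBLAS): turns the integer returned by a
// cuBLAS call into a readable name. The numeric values are assumed to match
// the cublasStatus_t enum in cublas_api.h; verify them against your toolkit.
std::string describeCublasStatus(int status) {
    switch (status) {
        case 0:  return "CUBLAS_STATUS_SUCCESS";
        case 1:  return "CUBLAS_STATUS_NOT_INITIALIZED";
        case 3:  return "CUBLAS_STATUS_ALLOC_FAILED";
        case 7:  return "CUBLAS_STATUS_INVALID_VALUE";
        case 8:  return "CUBLAS_STATUS_ARCH_MISMATCH";
        case 11: return "CUBLAS_STATUS_MAPPING_ERROR";
        case 13: return "CUBLAS_STATUS_EXECUTION_FAILED";
        case 14: return "CUBLAS_STATUS_INTERNAL_ERROR";
        case 15: return "CUBLAS_STATUS_NOT_SUPPORTED";
        case 16: return "CUBLAS_STATUS_LICENSE_ERROR";
        default: return "unknown cublasStatus_t value";
    }
}
```

If the call on the K20 returns something other than 0, the name may point at the cause. For example, a value of 8 (`CUBLAS_STATUS_ARCH_MISMATCH`) would suggest that the routine is not supported on that GPU's compute capability, which is worth checking against the cuBLAS documentation for cublasGemmEx.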