Q1: If I set algo to CUBLAS_GEMM_DEFAULT, will it use CUDA cores, Tensor Cores, or both?
Q2: The cuBLAS manual gives some description, but I am still confused. Is there a more detailed description of this parameter?
Q3: I noticed there is an API called cublasSetMathMode. If I set it to the pedantic compute mode, will cublasGemmEx use only CUDA cores?
CUBLAS_GEMM_DEFAULT
– Heuristics will try to pick the fastest based on problem parameters.
- It is nothing more than a knob allowing you to choose the matmul implementation algorithm.
CUBLAS_PEDANTIC_MATH
– This mode uses the prescribed precision and standardized arithmetic for all phases of calculations and is primarily intended for numerical robustness studies, testing, and debugging. This mode might not be as performant as the other modes.
- This doesn’t necessarily mean SIMT kernels (CUDA cores) will be used.
- It has more to do with precision guarantees.