Hi all!
I can’t get cublasStrsmBatched (line 113) to run without it returning CUBLAS_STATUS_EXECUTION_FAILED (13). To keep things simple, all matrix values and alpha are 1.0, all matrices are square, and lda, ldb, m and n are all equal. I can run cublasSgemmBatched and cublasStrsm the same way with no error. I expected cublasStrsmBatched to behave the same, but for me it does not. Please tell me if you have any idea what I am doing wrong here:
I get the same result on Linux with CUDA 5.5 on a T10, and on Windows with CUDA 5.5 on a GTX285.