Hi,
Are CUBLAS 4.0 operations run asynchronously?
Specifically, if we call cublasStrsm() twice, one call right after the other, on different data, e.g.,
cublasStrsm(param);
cublasStrsm(param2);
Will the two calls be launched asynchronously, so that they overlap on the GPU and increase its occupancy?
Thank you.
P.S.:
In CUBLAS 4.0, the signature is:
void cublasStrsm(char side, char uplo, char transa, char diag, int m, int n, float alpha, const float *A, int lda, float *B, int ldb);