Performing forward and backward substitution after Cholesky factorization (solving an AX=B system)

Hi,

If we have an AX=B system (A, X, B all matrices) and A is symmetric positive definite, we can use Cholesky factorization to solve it.

With A = LL’, where L is lower triangular, the problem becomes LL’X = B.

Then we need to perform forward substitution (to get Y) followed by backward substitution (to get X):
LY = B
L’X = Y

Forward and backward substitution are also an expensive part of this method. On GPUs, what is the best way to perform them?

Is there any CUDA example available for this case?

Thank you.

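You could use the TRSM routines (STRSM in single precision, DTRSM in double) available in CUBLAS. TRSM solves a triangular system with multiple right-hand sides and overwrites B with the solution, so two calls cover both steps: one with the lower triangular L untransposed for LY = B, then one with L transposed for L’X = Y.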