Separating L and U easily from cusparse<t>csrilu0

I was hoping to implement ILU0 preconditioning with my own fGMRES algorithm using cuSPARSE and cuBLAS. I found the ILU0 and triangular solve functions in cuSPARSE, but it seems that cusparseDcsrsv_solve is missing some functionality present in the MKL library.

The mkl_dcsrtrsv function lets you specify lower or upper as well as whether or not the diagonal is unit. The L and U separation of the ILU0 result is therefore done at the same time the triangular solver is called. If these options are not available in CUDA, then I need to perform the L and U separation before calling the triangular solve function. If that’s the case, then it seems pointless to transfer the ILU0 result to the host, separate it, transfer it back to the device and solve, as opposed to just computing ILU0 on the host and separating it before transferring it to the device.

Am I missing something (for example a function), or does it just make sense to perform the ILU0 and L/U separation on the host in this case?

Thanks.

I found an example in the cuSPARSE documentation that explains how to use the full ILU0 matrix in the two triangular solve phases. The matrix descriptor is used to tell the solver which part of the full ILU0 matrix to use (lower or upper) and whether or not the diagonal is unit, so no explicit L/U separation is needed.

Here’s a link to the example code (it’s right after the csrilu02_solve function definition): http://docs.nvidia.com/cuda/cusparse/#cusparse-lt-t-gt-csrilu02_solve
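
For anyone wondering why no separation is needed, here is a minimal host-side sketch in plain C of what the two descriptor settings amount to. This is not cuSPARSE code and not the code from the linked example; the function names lower_unit_solve and upper_nonunit_solve are made up for illustration, and it assumes zero-based CSR with the diagonal entry present in every row. Both solves read the same combined ILU0 arrays: the lower solve skips everything on or above the diagonal and treats the diagonal as 1 (the equivalent of CUSPARSE_FILL_MODE_LOWER with CUSPARSE_DIAG_TYPE_UNIT), and the upper solve skips everything below the diagonal and uses the stored diagonal (CUSPARSE_FILL_MODE_UPPER with CUSPARSE_DIAG_TYPE_NON_UNIT).

```c
/* Forward solve L*y = b using the combined ILU0 factor in CSR form.
   Only entries strictly below the diagonal are read; the diagonal of L
   is implied to be 1, so the stored diagonal (which belongs to U) is ignored. */
void lower_unit_solve(int n, const int *row_ptr, const int *col_idx,
                      const double *val, const double *b, double *y)
{
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            if (col_idx[k] < i)
                sum -= val[k] * y[col_idx[k]];
        }
        y[i] = sum; /* unit diagonal: no division */
    }
}

/* Backward solve U*x = y using the same combined arrays.
   Only the diagonal and the entries above it are read.
   Assumes every row stores a nonzero diagonal entry. */
void upper_nonunit_solve(int n, const int *row_ptr, const int *col_idx,
                         const double *val, const double *y, double *x)
{
    for (int i = n - 1; i >= 0; --i) {
        double sum = y[i];
        double diag = 1.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            if (col_idx[k] > i)
                sum -= val[k] * x[col_idx[k]];
            else if (col_idx[k] == i)
                diag = val[k];
        }
        x[i] = sum / diag;
    }
}
```

Calling lower_unit_solve and then upper_nonunit_solve on the same row_ptr/col_idx/val arrays applies the preconditioner without ever building separate L and U matrices. In cuSPARSE the same selection is made by creating two matrix descriptors and setting their fill mode and diagonal type with cusparseSetMatFillMode and cusparseSetMatDiagType before the triangular solves.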