Hello,
I want to use LU decomposition with cuSPARE with a sparse, very big (300,000x300,000) matrix.
I can do this with scipy or other LU decomposition libraries but in the example implementations I have found with cuSPARSE, I get errors.
cusparseScsrilu02_analysis(handle, N, nnz, descrA, d_A, d_A_RowIndices, d_A_ColIndices, info_A, solvePolicy1, pBuffer);
cusparseScsrilu02(handle, N, nnz, descrA, d_A, d_A_RowIndices, d_A_ColIndices, info_A, solutionPolicy, pBuffer);
cusparseStatus_t status = cusparseXcsrilu02_zeroPivot(handle, info_A, &numerical_zero);
if (CUSPARSE_STATUS_ZERO_PIVOT == status){ printf("U(%d,%d) is zero\n", numerical_zero, numerical_zero); }
float *d_z; cudaMalloc(&d_z, N * sizeof(float));
const float alpha = 1.;
cusparseScsrsv2_solve(handle, CUSPARSE_OPERATION_NON_TRANSPOSE, N, nnz, &alpha, descr_L, d_A, d_A_RowIndices, d_A_ColIndices, info_L, d_x, d_z, CUSPARSE_SOLVE_POLICY_NO_LEVEL, pBuffer);
/*********************/
/* STEP 5: U * y = z */
/*********************/
cusparseScsrsv2_solve(handle, CUSPARSE_OPERATION_NON_TRANSPOSE, N, nnz, &alpha, descr_U, d_A, d_A_RowIndices, d_A_ColIndices, info_U, d_z, d_y, CUSPARSE_SOLVE_POLICY_USE_LEVEL, pBuffer);
So, I get a
CUSPARSE_STATUS_ZERO_PIVOT
, and get NAN values for the array which I find via back-substitution.
If I use this same matrix using Scipy
from scipy.sparse.linalg import splu
LU = splu(lap)
x = LU.solve(b)
or indeed using CUDA CUSP libraries (iterative method like BiCGStab) I get a good answer. However, think LU decomposition may be faster, so I wish to use this in CUDA.
Does anyone know the limitations of the example LU decomposition solver described above (and taken from CUDA examples on this website).