I was testing the CUDA 10.2 sample cuSolverSp_LowlevelCholesky (\CUDA Samples\v10.2\7_CUDALibraries\cuSolverSp_LowlevelCholesky) on a GeForce GTX Titan X with 12288 MB of memory on Windows 10. I tried Visual Studio 2013 and 2019 and observed the same behavior with both.

With the lap2D_5pt_n100.mtx file that comes with the sample (10000x10000, 29800 non-zero items, file size about 402 kB), the sample runs well and produces the following output.

GPU Device 0: "Maxwell" with compute capability 5.2

Using default input file [./lap2D_5pt_n100.mtx]

step 1: read matrix market format

sparse matrix A is 10000 x 10000 with 49600 nonzeros, base=1

step 2: create opaque info structure

step 3: analyze chol(A) to know structure of L

step 4: workspace for chol(A)

step 5: compute A = L*L^T
step 6: check if the matrix is singular
step 7: solve A*x = b

step 8: evaluate residual r = b - A*x (result on CPU)

(CPU) |b - A*x| = 3.637979E-012

(CPU) |A| = 8.000000E+000

(CPU) |x| = 7.513384E+002

(CPU) |b - A*x|/(|A|*|x|) = 6.052497E-016

step 9: create opaque info structure

step 10: analyze chol(A) to know structure of L

step 11: workspace for chol(A)

step 12: compute A = L*L^T

step 13: check if the matrix is singular

step 14: solve A*x = b

(GPU) |b - A*x| = 1.364242E-012

(GPU) |b - A*x|/(|A|*|x|) = 2.269686E-016
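
As a side note, the 29800 items in the file and the 49600 nonzeros in the log are consistent if the file stores only one triangle of the symmetric matrix and the reader mirrors the off-diagonal entries. Here is a minimal sketch of that arithmetic, assuming every one of the 10000 diagonal entries is present in the file. `expandedNnz` is a hypothetical helper written for illustration, not part of the sample.

```cpp
#include <cassert>

// Hypothetical helper: number of nonzeros after a symmetric matrix stored as
// one triangle (diagonal included) is expanded to full storage.
// Each off-diagonal entry is mirrored once; diagonal entries are not.
long long expandedNnz(long long storedEntries, long long diagonalEntries) {
    // A triangle cannot hold fewer entries than its own diagonal.
    assert(storedEntries >= diagonalEntries && diagonalEntries >= 0);
    const long long offDiagonal = storedEntries - diagonalEntries;
    return diagonalEntries + 2 * offDiagonal;
}
```

Calling `expandedNnz(29800, 10000)` gives 10000 + 2 * 19800 = 49600, which matches the count printed in step 1.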

However, if I run the same executable with either of two other matrices, Matrix 1 (138507x138507, 6155289 non-zero items, file size about 138 MB) or Matrix 2 (109872x109872, 2311992 non-zero items, file size about 42 MB), the program terminates with CUSOLVER_STATUS_INTERNAL_ERROR.

The matrix files are available at the following link for your reference.

https://www.dropbox.com/s/2ejxjwndr9mlt05/matrix1.zip?dl=0

https://www.dropbox.com/s/sv881bb9wawb71w/matrix2.zip?dl=0

The following output is for Matrix 1; the output for Matrix 2 is similar.

GPU Device 0: "Maxwell" with compute capability 5.2

Using default input file [./lap2D_5pt_n100.mtx]

step 1: read matrix market format

sparse matrix A is 138507 x 138507 with 6155289 nonzeros, base=0

step 2: create opaque info structure

step 3: analyze chol(A) to know structure of L

step 4: workspace for chol(A)

step 5: compute A = L*L^T
step 6: check if the matrix is singular
step 7: solve A*x = b

step 8: evaluate residual r = b - A*x (result on CPU)

(CPU) |b - A*x| = 4.176400E-009

(CPU) |A| = 2.373223E+015

(CPU) |x| = 5.219332E-008

(CPU) |b - A*x|/(|A|*|x|) = 3.371697E-017

step 9: create opaque info structure

step 10: analyze chol(A) to know structure of L

step 11: workspace for chol(A)

step 12: compute A = L*L^T

step 13: check if the matrix is singular

CUDA error at cuSolverSp_LowlevelCholesky.cpp:327 code=7(CUSOLVER_STATUS_INTERNAL_ERROR) "cusolverSpDcsrcholZeroPivot( cusolverSpH, d_info, tol, &singularity)"
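
Note that the CPU path (steps 2 to 8) completes on Matrix 1 with a small normalized residual, so the failure appears only in the GPU path. For reference, the residual figures in both logs can be reproduced on the host with a sketch like the one below. It assumes the norms are infinity norms, which is consistent with |A| = 8 for the 5-point Laplacian. All function names here are hypothetical and are not taken from the sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Infinity norm of a vector: largest absolute entry.
double vecNormInf(const std::vector<double>& v) {
    double m = 0.0;
    for (double e : v) m = std::max(m, std::fabs(e));
    return m;
}

// Infinity norm of a CSR matrix: largest absolute row sum.
// `base` is the index base printed in step 1 of the log (0 or 1).
double csrNormInf(const std::vector<int>& rowPtr,
                  const std::vector<double>& val, int base) {
    double m = 0.0;
    for (std::size_t i = 0; i + 1 < rowPtr.size(); ++i) {
        double rowSum = 0.0;
        for (int k = rowPtr[i] - base; k < rowPtr[i + 1] - base; ++k)
            rowSum += std::fabs(val[k]);
        m = std::max(m, rowSum);
    }
    return m;
}

// Residual r = b - A*x for a CSR matrix A.
std::vector<double> residual(const std::vector<int>& rowPtr,
                             const std::vector<int>& colInd,
                             const std::vector<double>& val, int base,
                             const std::vector<double>& x,
                             const std::vector<double>& b) {
    assert(rowPtr.size() == b.size() + 1);
    std::vector<double> r(b);
    for (std::size_t i = 0; i + 1 < rowPtr.size(); ++i)
        for (int k = rowPtr[i] - base; k < rowPtr[i + 1] - base; ++k)
            r[i] -= val[k] * x[colInd[k] - base];
    return r;
}

// Normalized residual |b - A*x| / (|A| * |x|), all in the infinity norm.
double relativeResidual(const std::vector<int>& rowPtr,
                        const std::vector<int>& colInd,
                        const std::vector<double>& val, int base,
                        const std::vector<double>& x,
                        const std::vector<double>& b) {
    const double rInf = vecNormInf(residual(rowPtr, colInd, val, base, x, b));
    return rInf / (csrNormInf(rowPtr, val, base) * vecNormInf(x));
}
```

With the logged values for Matrix 1, 4.176400E-009 / (2.373223E+015 * 5.219332E-008) is about 3.37E-017, in line with the last CPU line of the log.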

Could you please advise how to debug and resolve this error?