I have an application which could benefit from acceleration of a sparse linear solver.

I compiled and ran the code sample to see what kind of performance benefit can be had, and to my surprise, the code sample reported that the CPU was faster than the GPU.

Is this typical, or am I reading something wrong?

```
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1

Using default input file [./lap2D_5pt_n100.mtx]
step 1: read matrix market format
sparse matrix A is 10000 x 10000 with 49600 nonzeros, base=1
step 2: reorder the matrix A to minimize zero fill-in
        if the user choose a reordering by -P=symrcm or -P=symamd
        The reordering will overwrite A such that A := A(Q,Q) where Q = symrcm(A) or Q = symamd(A)
step 2.1: set right hand side vector (b) to 1
step 3: prepare data on device
step 4: solve A*x = b on CPU
step 5: evaluate residual r = b - A*x (result on CPU)
(CPU) |b - A*x| = 4.547474E-12
(CPU) |A| = 8.000000E+00
(CPU) |x| = 7.513384E+02
(CPU) |b - A*x|/(|A|*|x|) = 7.565621E-16
step 6: solve A*x = b on GPU
step 7: evaluate residual r = b - A*x (result on GPU)
(GPU) |b - A*x| = 1.818989E-12
(GPU) |A| = 8.000000E+00
(GPU) |x| = 7.513384E+02
(GPU) |b - A*x|/(|A|*|x|) = 3.026248E-16
timing chol: CPU = 0.075586 sec , GPU = 0.094802 sec
```
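For reference, the CPU side of what the sample does can be reproduced in a few lines of SciPy: assemble the same 10000 x 10000 five-point Laplacian (what `lap2D_5pt_n100.mtx` appears to contain, given the size, the 49600 nonzeros, and |A| = 8), set b = 1, solve, and evaluate the sample's residual metric |b - A*x|/(|A|*|x|). This is a sketch, not the sample's code: I'm assuming the matrix is the standard unscaled 5-point stencil (4 on the diagonal, -1 off-diagonal), and I use SciPy's sparse LU where the sample uses Cholesky.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 100  # n x n grid, so A is 10000 x 10000 like the sample's matrix

# Assemble the 2D 5-point Laplacian via Kronecker products:
# 4 on the diagonal, -1 for the four grid neighbors.
I = sp.identity(n, format="csr")
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()  # 49600 nonzeros

b = np.ones(A.shape[0])  # step 2.1: right-hand side set to 1

x = splu(A).solve(b)     # sparse LU standing in for the sample's Cholesky

# Residual metric printed by the sample, using infinity norms.
r = b - A @ x
norm_A = sp.linalg.norm(A, np.inf)   # row-sum norm; 8.0 for this stencil
norm_x = np.linalg.norm(x, np.inf)
rel = np.linalg.norm(r, np.inf) / (norm_A * norm_x)
print(norm_A, norm_x, rel)
```

If the assumption about the matrix holds, `norm_A` comes out to 8 and `norm_x` to roughly 7.5134E+02, matching the log above, with a relative residual near machine epsilon.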