Hello there,

First of all, let me point out that I'm relatively new to the world of GPU programming.

This is an SpMV kernel for CSR, as it appears in the programming guide.

    __global__ void SpMV(
        const float * csrNz_d, const int * csrCols_d,
        const int * csrRowStart_d, const float * x_d, float * y_d,
        const int num_rows
    )
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < num_rows) {
            float dot = 0; //or float ?!?!
            int row_start = csrRowStart_d[row];
            int row_end = csrRowStart_d[row + 1];
            for (int jj = row_start; jj < row_end; jj++)
                dot += csrNz_d[jj] * x_d[csrCols_d[jj]];
            y_d[row] += dot;
        }
    }
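For comparison, here is a minimal host-side sketch of the same computation, which could be used to check the GPU result on a small matrix. The function name `spmv_csr_host` and the use of `std::vector` are my own assumptions for illustration; the parameter names follow the guide's terms (data, indices, ptr, x, y). Like the kernel, it accumulates into y, so y has to be zeroed (or hold the intended initial values) before the call.

```cpp
#include <vector>

// Hypothetical CPU reference: y += A * x, with A stored in CSR form.
// data    : non-zero values          (csrNz_d in the kernel)
// indices : column index per value   (csrCols_d)
// ptr     : start offset of each row (csrRowStart_d), size num_rows + 1
void spmv_csr_host(const std::vector<float>& data,
                   const std::vector<int>& indices,
                   const std::vector<int>& ptr,
                   const std::vector<float>& x,
                   std::vector<float>& y)
{
    const int num_rows = static_cast<int>(ptr.size()) - 1;
    for (int row = 0; row < num_rows; ++row) {
        float dot = 0.0f;
        // Walk the non-zeros of this row, exactly as one GPU thread would.
        for (int jj = ptr[row]; jj < ptr[row + 1]; ++jj)
            dot += data[jj] * x[indices[jj]];
        y[row] += dot; // accumulates, same as the kernel
    }
}
```

Running this on the same arrays that were copied to the device and comparing against the copied-back y_d should at least show whether the kernel produced any output at all.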

This is part of main():

    int block_size = 512;
    int n_blocks = dim.M/block_size + (dim.M%block_size == 0 ? 0:1);
    SpMV <<< n_blocks, block_size >>> (csrNz_d, csrCols_d, csrRowStart_d, x_d, y_d, dim.M);

Using the terms in the guide, that would be:

SpMV <<< n_blocks, block_size >>> (data, indices, ptr, x, y, num_rows);
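The n_blocks expression in main() is a rounded-up integer division, so that every row gets a thread. A small sketch of the same arithmetic, with `blocks_for` being a hypothetical helper name of mine rather than anything from the guide:

```cpp
// Hypothetical helper: number of blocks needed so that
// blocks * block_size >= n (integer division rounded up).
int blocks_for(int n, int block_size)
{
    return (n + block_size - 1) / block_size;
}
```

With block_size = 512 this gives 98 blocks for 50 000 rows and 938 blocks for 480 000 rows, so in both cases the grid stays far below the 65535 blocks per grid dimension allowed on this class of hardware.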

This seems to work perfectly for relatively small matrices of about 100 000 elements and sizes of around 50 000 x 50 000.

However, when I input a bigger matrix, e.g. 480 000 x 171 000 with approximately 6 million non-zero elements, the returned vector is all zeros. I have tried many different matrices, and it only works for the smaller ones. I have placed error-catching statements after each device statement, but they do not report anything; I simply get back my y_d vector with all elements equal to 0.

I’m using an 8600GT M GPU.

Any suggestions as to why this could be happening?

Cheers,