Hello,
I ran the SDK examples on both a GTX 280 and a brand-new GTX 480.
For example, with the eigenvalues problem, I got:
21.813000 ms on the gtx 280
and
40.384 ms on the gtx 480
Similarly, on matrixMul I get better performance with the GTX 280 (211 GFlops vs. 207 GFlops).
Has anyone else noticed the same problem?
Changing the grid and block sizes didn’t seem to resolve it. Is there a way to optimize this, or does the code need deeper changes?
Thanks
Here are my GTX470 eigenvalue results from the GTX470/GTX480 benchmarks thread:
[root@chicadee release]# ./eigenvalues
[ CUDA eigenvalues ]
Matrix size: 2048 x 2048
Precision: 0.000010
Iterations to be timed: 100
Result filename: 'eigenvalues.dat'
Gerschgorin interval: -2.894310 / 2.923303
Average time step 1: 9.024376 ms
Average time step 2, one intervals: 3.029389 ms
Average time step 2, mult intervals: 0.005510 ms
Average time TOTAL: 12.074430 ms
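If you want to rule out timing methodology and launch configuration before digging into the SDK code, here is a minimal sketch of how to time a kernel with CUDA events while sweeping block sizes. The `saxpy` kernel, array size, and iteration count are my own placeholders, not the SDK matrixMul or eigenvalues code; the point is only the event-based timing loop.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial placeholder kernel: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Sweep block sizes: the best choice on GT200 (GTX 280) is not
    // necessarily the best on Fermi (GTX 480), so retune per card.
    for (int block = 64; block <= 512; block *= 2) {
        int grid = (n + block - 1) / block;
        saxpy<<<grid, block>>>(n, 2.0f, x, y);   // warm-up launch
        cudaEventRecord(start);
        for (int i = 0; i < 100; ++i)
            saxpy<<<grid, block>>>(n, 2.0f, x, y);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);              // wait for the GPU to finish
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);  // wall time between events
        printf("block=%3d  avg %.3f ms\n", block, ms / 100.0f);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Events measure time on the GPU itself, so they avoid the host-side synchronization pitfalls that can distort CPU-timer measurements across different driver versions.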
Thanks for the answer.
I didn’t see your previous thread.
Well, something must be wrong on my end. I ran the test under Windows and will retest it under Linux (as soon as it is installed).
Anyway, thanks. I’m pleased to hear that the problem is only on my side ;)