I am porting several numerical software solvers from massively parallel and vector platforms to GPGPU. I have 10 years of experience on the Cray C90 and Y-MP, and about 20 years on the Cray T3D, T3E, RS, and Linux clusters. So I am a numerical expert, not a computer-graphics expert, which is probably the main reason I cannot figure out what I should do with GPUs.
I now have Fedora 7 installed on a dual Xeon with 2 GB of main memory and a GeForce 8800 GTX, plus millions of lines of massively parallel Fortran and C source code.
My goal is to reach 100 GFlop/s with my applications on a single GTX; otherwise I have no reason to use GPUs. My algorithms already run at 4-5 GFlop/s on one dual-core CPU, and if I switch from double to float, most of my algorithms require 2-3 times more computational work.
Please suggest a good user manual for performance tuning; the Programming Guide V1.0 does not seem sufficient to me.
If there is no such book, may I post some questions about tuning here?