Does anyone have experience porting OpenGL + Cg code that runs on G7X cards to CUDA for the G80? How much of a performance gain should I expect?
To be a bit more precise: my application is a memory-limited sparse matrix multiplication that runs at 10 Gflops on G7X using OpenGL + Cg, and at around 30 Gflops on the G80 with the same code (that is, without using CUDA). This 3x speedup is already pretty impressive, and I was wondering whether it would be worthwhile to rewrite the whole thing in CUDA. Would that bring further speed improvements?
I know the question is rather vague (I guess the answer depends on the details of the code), but I would be very happy to hear about anyone’s experience comparing OpenGL + Cg code with CUDA code.
(Hey, this is the first Linux post, or what? :))