I’m writing an article on CUDA for gamedev.net and am having trouble with my numbers. Why are they so far off from what I measure? Here’s the math:
CPU = 3 GHz
8800 GTX = 575 MHz
The program is 30 ops. Running it 3 million times on my CPU should take 35.5 ms, but instead it takes 702 ms.
The data transfer is 8 ops total (read and write). Running that transfer 3 million times on my 8800 GTX should take 42 ms, but instead it takes 38 ms.
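For reference, the "should take" figures come from a naive back-of-envelope model: time = ops per iteration × iterations / clock, assuming one op retires per cycle and ignoring pipelining, caches, and memory latency. Here is a minimal C sketch of that arithmetic for the GPU transfer figure (the estimate_ms helper is just for illustration; the op count, iteration count, and clock are the ones quoted above):

#include <stdio.h>

/* Naive throughput estimate: assumes exactly one op per clock cycle,
   no pipelining, no caches, no memory latency. */
static double estimate_ms(double ops_per_iteration, double iterations, double clock_hz)
{
    return ops_per_iteration * iterations / clock_hz * 1000.0;
}

int main(void)
{
    /* 8-op transfer, 3 million iterations, 575 MHz 8800 GTX -> ~41.7 ms,
       which is where the 42 ms figure above comes from. */
    printf("GPU transfer estimate: %.1f ms\n", estimate_ms(8.0, 3.0e6, 575.0e6));
    return 0;
}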
The GPU does one read or write per cycle, right? So how am I beating the theoretical number there, and why are the CPU numbers so bad? Are there any articles or discussions about this stuff?