A somewhat controversial paper was presented at the ISCA conference this week:
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU, by Victor Lee et al. from Intel.
I think it may be an interesting read for the CUDA developer community… (and it has been a while since we last had a speedup measurement methodology debate :) )
The authors compare the performance of several parallel kernels on a Core i7 960 against a GTX 280. The kernels are highly tuned on both sides.
They measure very reasonable speedups, ranging from 0.5x to 14x, with 2.5x on average.
The paper follows up by analyzing the causes of suboptimal performance on both sides, and the implications for architecture design.
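Since this thread is likely to turn into a speedup-measurement-methodology debate anyway, here is a minimal sketch (Python, with made-up per-kernel numbers, not the paper's actual data) of how much the choice between arithmetic and geometric mean can move a headline "average speedup":

```python
import math

# Hypothetical per-kernel GPU-over-CPU speedups, for illustration only
# (these are NOT the numbers from the Lee et al. paper).
speedups = [0.5, 0.9, 1.8, 2.2, 3.0, 5.0, 14.0]

# Arithmetic mean: easy to quote, but dominated by the largest outlier.
arithmetic_mean = sum(speedups) / len(speedups)

# Geometric mean: the usual way to average ratios such as speedups,
# since it treats GPU-over-CPU and CPU-over-GPU symmetrically.
geometric_mean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))

print(f"arithmetic mean: {arithmetic_mean:.2f}x")  # ~3.91x with these numbers
print(f"geometric mean:  {geometric_mean:.2f}x")   # ~2.33x with these numbers
```

With these illustrative numbers the "average speedup" is either ~3.9x or ~2.3x depending on which mean you quote, so it is always worth checking which one a paper (or a press release) uses.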
So here is the official PR answer from NV:
Unlike the blog poster, I would not question the fairness of Intel's analysis. But he does have a point: the real myth is that modern CPUs are easier to program than GPUs.
In this regard, it is interesting to note that Fermi’s improvements are mostly on the programmability side, and not that much on raw performance…
Any thoughts about this?