Tesla1060 vs GS8600

Hmm, I’m getting some strange results here. I’ve not fully tested everything, but…
Tesla1060 vs GS8600… running my CUDA code, the GS is faster!!!
The profiler, run on the same code on the other machine, shows roughly 60% GPU time and 40% memory operations.

I should mention that this is a GeForce8600 not a GS8600… DOH

Anyway, I’ve done a load more testing and basically all my code executes in the same time on either the Tesla or the GeForce8600, which is somewhat disappointing really. I would have thought the higher processor speed alone would have made a difference. Also, I’m running far more threads than can be run in parallel on the 8600, so that should have made a huge difference. (And yes, I am timing it properly.)

Even running some of the SDK examples shows at best a marginal speed advantage for the Tesla over the 8600.

I’ve attached a sample program and would be interested to know what time is reported for different cards…

The program simply computes 100,000,000 sigmoid functions. I doubt it’s particularly well optimised, but that’s not the point here.
sigArray_shared.cu (2.17 KB)

Actually, I think you will find you are simply measuring the time for the kernel launch to fail in both cases, which is why both cards “perform” the same (i.e. nothing is happening in either case). You really ought to re-read the section of the programming guide on execution parameters, because your combination of block size, grid size, and shared memory is invalid. It is also worth getting into the habit of adding some error checking so you can catch problems like this…
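For anyone wondering what that kind of error checking might look like, here is a minimal sketch. It is not taken from the attached program: `CUDA_CHECK` and `dummyKernel` are hypothetical names made up for illustration, the block size of 4096 is deliberately chosen to exceed the per-block thread limit (512 on a Tesla C1060, 1024 on current cards), and `cudaDeviceSynchronize` assumes a reasonably recent toolkit (older ones used `cudaThreadSynchronize`).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original program):
// abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",             \
                    __FILE__, __LINE__, cudaGetErrorString(err_));   \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Hypothetical stand-in kernel: writes 1.0f to every element.
__global__ void dummyKernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 1.0f;
}

int main()
{
    const int n = 1 << 20;
    float *d_out = NULL;
    CUDA_CHECK(cudaMalloc((void **)&d_out, n * sizeof(float)));

    // Deliberately invalid execution configuration:
    // 4096 threads per block is above the per-block limit.
    const int threads = 4096;
    const int blocks = (n + threads - 1) / threads;
    dummyKernel<<<blocks, threads>>>(d_out, n);

    // A kernel launch returns nothing, so an invalid configuration
    // only shows up if you ask for the last error afterwards.
    cudaError_t launchErr = cudaGetLastError();
    if (launchErr != cudaSuccess) {
        printf("kernel launch failed: %s\n", cudaGetErrorString(launchErr));
    } else {
        // Errors raised while the kernel runs surface at the next sync.
        CUDA_CHECK(cudaDeviceSynchronize());
        printf("kernel launch ok\n");
    }

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

Without the `cudaGetLastError()` call the failed launch is silent, and any timer wrapped around it just measures how long it takes to do nothing, which would look the same on both cards.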

Hi Avidday, oops, you are quite right. I had the program working a while back and obviously messed with it at some point without realising. Anyway, I’ve fixed it and I do indeed get a nice speedup using the Tesla over the GeForce8600, so I’m happy again. Clearly I need to work on my CUDA programming skills!

Thanks for pointing that out. If anyone is interested here is a working version of the program.

I get

Tesla - 0.141152 ms

GeForce8600 - 0.560864 ms

If I change N to N=4000000 I get

Tesla - 0.465056 ms

GeForce8600 - 2.372608 ms

If it works you should get output 0.5 0.50025 0.5005…
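For readers without the attachment, here is a rough sketch of what such a test could look like. This is a guess, not the contents of sigArray.cu: the kernel name `sigmoidKernel`, the 256 x 1024 launch configuration, and the input spacing of 0.001 are all assumptions (the spacing is inferred from the quoted output, since sigmoid(0) = 0.5 and sigmoid(0.001) is roughly 0.50025). It times the kernel with CUDA events and checks the launch for errors before trusting the timing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed kernel: element i holds sigmoid(i * step).
// The grid-stride loop lets a fixed, valid grid size cover any n.
__global__ void sigmoidKernel(float *out, int n, float step)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += gridDim.x * blockDim.x) {
        float x = i * step;
        out[i] = 1.0f / (1.0f + expf(-x));
    }
}

int main()
{
    const int n = 4000000;      // N from the post above
    const int threads = 256;    // assumed; within the limit of both cards
    const int blocks = 1024;    // assumed; well below the grid size limit

    float *d_out = NULL;
    if (cudaMalloc((void **)&d_out, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time only the kernel, using events recorded on the default stream.
    cudaEventRecord(start, 0);
    sigmoidKernel<<<blocks, threads>>>(d_out, n, 0.001f);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    // Make sure the launch really happened before trusting the timing.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Copy back the first three results as a sanity check.
    float h[3];
    cudaMemcpy(h, d_out, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%g %g %g\n", h[0], h[1], h[2]);
    printf("kernel time: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return 0;
}
```

Under the spacing assumption the first three values should come out close to the 0.5 0.50025 0.5005 quoted above. The timings will of course differ from the numbers in this thread, since the launch configuration here is not the one from the attachment.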

sigArray.cu (2.02 KB)