I just installed a Tesla S1070 on a Dell XPS running Linux.
I installed the SDK, compiled the “Template” sample program, and ran it; it reported a run time of:
Previously I had been developing on my laptop, a MacBook Pro with an 8600M GT (much slower than a T10, I assume). Running the same Template sample there gives a run time of:
Interestingly enough, every other sample program produces the expected result: the T10 outperforms my 8600M GT by far.
Unfortunately, I used the Template program as the basis for my own program. On my laptop, the GPU version of my program significantly outperforms the serial CPU-based algorithm. However, when run on the Tesla S1070 attached to the Dell XPS (with an i7 CPU), the CPU significantly outperforms the GPU version.
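In case the measurement method matters: my timing follows the same pattern as the Template sample. Here is a minimal sketch of that pattern (the kernel and names here are placeholders, not my actual code, and the real SDK sample uses the cutil timer helpers rather than CUDA events):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the sample's computation.
__global__ void myKernel(float *d_data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d_data[i] *= 2.0f;
}

int main()
{
    const int n = 1 << 20;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Note: the first kernel launch in a process also pays one-time
    // CUDA context-initialization cost, and a timer started before it
    // (as the Template sample does) will include that overhead.
    cudaEventRecord(start, 0);
    myKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Processing time: %f (ms)\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

The reported number is wall time around a single launch, not sustained kernel throughput, which is why I wonder whether what the Template sample measures is even comparable across the two machines.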
I suspect that whatever makes the unmodified Template sample run slower on a T10 than on an 8600M GT is also the reason my own program fails to show the same speedup on the XPS machine as it does on my laptop.
Any ideas as to why this is happening?