I have run the Dot Product program from Nvidia’s sample code. I also wrote a Dot Product program that processes a similar amount of data (five million) on a single CPU, using Visual Studio without OpenCL. The latter is ten times faster.
Are there any suggestions for speeding up the OpenCL version?
To my understanding, a GeForce 9200M GS is only half of a 9400M, so it has only 1 multiprocessor (8 “cores” in total). That means it probably has no more computing power than your Core 2 Duo. Add the overhead of launching an OpenCL kernel, copying memory, and so on, and it’s normal that you don’t see much performance out of it.