OpenCL vs CUDA C performance - nbody sample (nbody sample for CUDA C much faster than OpenCL)

Sorry if this seems like a dumb question, but I am new to this (and pretty excited about it :) ). I ran the OpenCL nbody example and the CUDA C nbody example on a machine with 2 Quadro FX 1700 cards (having two shouldn’t matter, but I mention it just in case), and the OpenCL version was significantly slower. Is this expected right now? I could be wrong, but my guess is that it is because OpenCL - OpenGL interop is not supported. Either way, I’m just curious and was hoping that someone might be able to tell me. The only reason I ask is that the sample OpenCL video showed the nbody sample moving a lot faster, but that is most likely because I have a weaker card in my system than the one used there.

Thank you very much!

Probably a matter of this being the first official implementation of OpenCL. Many samples are quite a bit slower (volume rendering is quite drastic too)… and some don’t work at all (at least on my 9600).
The latest GPU SDK doesn’t hint that OpenCL/OpenGL interop is unsupported.

It will take a while until they have optimized it as well as CUDA.

I would really like to see this addressed as well. I want to switch over from CUDA to OpenCL, but right now the performance is not acceptable.