GeForce 9600GT vs Quadro FX5800

I ran the same CUDA program on two machines with different video cards, and found that the GeForce 9600GT is almost twice as fast as the Quadro FX5800. Is this normal?

GeForce 9600GT: 512 MB video memory, running on XP 32-bit, driver 195.62
Quadro FX5800: 4 GB video memory, running on Vista 32-bit, driver 195.39

I use a 16x16 block size on both machines. Does anybody have any suggestions on how to improve the performance on the Quadro FX5800?

Thanks!


Never mind. I found the problem. To avoid duplicating the project setup on each machine, I had copied all the necessary DLLs, libs, and header files into a subfolder, but I forgot to update them on the Vista machine.
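
For anyone who hits something similar, here is a minimal sketch of a sanity check that can be run on each machine before comparing timings. It is not the program from this thread, just an illustration: it prints the CUDA runtime version the executable is actually using, the version supported by the installed driver, and the basic properties of each device. It assumes the runtime is loaded from a DLL (as with the copied files described above), in which case an out-of-date copy should show up as a different runtime version on the two machines.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Print an error message and return false if a CUDA runtime call failed.
static bool check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    int runtimeVersion = 0;
    int driverVersion = 0;

    // Version of the CUDA runtime library this program is using.
    if (!check(cudaRuntimeGetVersion(&runtimeVersion), "cudaRuntimeGetVersion")) return 1;
    // Latest CUDA version supported by the installed display driver.
    if (!check(cudaDriverGetVersion(&driverVersion), "cudaDriverGetVersion")) return 1;

    // Versions are encoded as 1000 * major + 10 * minor.
    printf("CUDA runtime version: %d.%d\n",
           runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    printf("CUDA driver version:  %d.%d\n",
           driverVersion / 1000, (driverVersion % 1000) / 10);

    int deviceCount = 0;
    if (!check(cudaGetDeviceCount(&deviceCount), "cudaGetDeviceCount")) return 1;

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        if (!check(cudaGetDeviceProperties(&prop, dev), "cudaGetDeviceProperties")) return 1;
        printf("Device %d: %s, compute capability %d.%d, %d multiprocessors, %.0f MB global memory\n",
               dev, prop.name, prop.major, prop.minor,
               prop.multiProcessorCount,
               (double)prop.totalGlobalMem / (1024.0 * 1024.0));
    }
    return 0;
}
```

If the two machines report different runtime versions for the same build, the copied libraries are the first thing to check.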