Incorrect output in VS 2005, but correct output in VS 2010.

Does anyone know why CUDA produces an incorrect result in VS 2005 (please see below), but a correct one in VS 2010? Is this a setup issue? I used the SDK 4.2 template solution for both. (SDK 4.1 behaved similarly.)

The same code prints the incorrect result "1 + 1 = 1638180" in VS 2005, but the correct result "1 + 1 = 2" in VS 2010:

#include <stdio.h>

// Kernel: adds x and y and stores the sum in device memory at z.
__global__ void d(int x, int y, int *z)
{
    *z = x + y;
}

int main(void)
{
    int a = 1, b = 1, c, *c_device;

    cudaMalloc((void**)&c_device, sizeof(int));
    d<<<1, 1>>>(a, b, c_device);  // launch one block with one thread
    cudaMemcpy(&c, c_device, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d + %d = %d\n", a, b, c);

    cudaFree(c_device);

    return 0;
}

It worked fine after I uninstalled & re-installed the CUDA toolkit (SDK 4.2).
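
For anyone who hits something similar: a value like 1638180 looks like uninitialized host memory, which would be consistent with the launch or the copy failing and leaving c untouched. That is only a guess, since the code above never checks any return codes. Here is a minimal sketch of the same program with error checking, assuming a toolkit of version 4.0 or later (for cudaDeviceSynchronize). CUDA_CHECK is a hypothetical helper macro I made up for this example, not something the toolkit provides.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA toolkit): print the CUDA error
// string with file and line, then exit, if a runtime call does not succeed.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Same kernel as in the original post.
__global__ void d(int x, int y, int *z)
{
    *z = x + y;
}

int main(void)
{
    // c is initialized so that a failed copy cannot print stack garbage.
    int a = 1, b = 1, c = 0, *c_device = NULL;

    CUDA_CHECK(cudaMalloc((void**)&c_device, sizeof(int)));

    d<<<1, 1>>>(a, b, c_device);
    CUDA_CHECK(cudaGetLastError());       // reports errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(&c, c_device, sizeof(int), cudaMemcpyDeviceToHost));
    printf("%d + %d = %d\n", a, b, c);

    CUDA_CHECK(cudaFree(c_device));
    return 0;
}
```

If the installation is broken, a version like this should stop at the first failing call and print the runtime's error string instead of a wrong sum, which may help tell a setup problem apart from a code problem.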