invalid device function error with GTX480

I just purchased a GTX480 for testing and installed it as my second graphics card, alongside my existing 8800GT (1GB).

My CUDA program works on the 8800GT but fails on the GTX480: a kernel launch fails with “invalid device function.”

I use cudaSetDevice() to select the device, which is passed as a command-line argument. The GTX480 is device 0 and the 8800GT is device 1.
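For reference, a minimal sketch of that kind of device selection, not my actual code. It assumes the device index is the first command-line argument and also prints the compute capability of the selected card, which helps when comparing the two GPUs:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    // Assumption: the device index is the first command-line argument (default 0).
    int dev = (argc > 1) ? atoi(argv[1]) : 0;

    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (dev < 0 || dev >= count) {
        fprintf(stderr, "device %d out of range (%d device(s) found)\n", dev, count);
        return 1;
    }

    // Select the device before any allocation or kernel launch.
    err = cudaSetDevice(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Report which card was actually selected and its compute capability.
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("device %d: %s, compute capability %d.%d\n",
           dev, prop.name, prop.major, prop.minor);
    return 0;
}
```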

The failing kernel runs successfully earlier in the code and only fails the second time it is called, that time with a different device memory pointer.

All pointers passed to that kernel point to device memory, allocated shortly after setting the device. These pointers are also used elsewhere in the code, before this particular kernel.

I use cudaGetDevice() in the kernel to confirm it is working on the correct device.

I am wondering if it is an issue with the different compute capabilities of the GTX480 (2.0) versus the 8800GT (1.1). Any ideas?
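One way to test that suspicion is to ask the runtime whether it has a usable image of a given kernel for the current device. The sketch below is only an illustration: myKernel is a hypothetical stand-in for the failing kernel, and it uses cudaDeviceSynchronize(), which older toolkits spelled cudaThreadSynchronize(). If the binary contains no code the selected GPU can run, I would expect cudaFuncGetAttributes() or the launch check to report the error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the failing kernel.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main()
{
    // Ask the runtime which compiled image it would use on the current device.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("binaryVersion %d, ptxVersion %d\n", attr.binaryVersion, attr.ptxVersion);

    const int n = 1024;
    float *d_data = NULL;
    err = cudaMalloc((void **)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    myKernel<<<(n + 255) / 256, 256>>>(d_data, n);

    // Launch-time errors (such as "invalid device function") show up here.
    err = cudaGetLastError();
    if (err != cudaSuccess)
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));

    // Errors raised while the kernel runs show up at the next synchronization.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "execution failed: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return 0;
}
```

Checking cudaGetLastError() after every launch, including launches made inside third-party libraries where possible, helps pin the error on the call that actually caused it rather than on a later one.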

It appears to be an issue with CUDPP. Removing it and using another reduction routine solved the problem.
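For anyone needing a similar workaround, here is a minimal sketch of a hand-written sum reduction. This is not the routine I ended up using, just a plain shared-memory tree reduction; blockSum and THREADS are illustrative names, and the per-block partial sums are added up on the host for simplicity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS 256  // threads per block; must be a power of two

// Each block sums up to THREADS elements into one partial result.
__global__ void blockSum(const float *in, float *out, int n)
{
    __shared__ float s[THREADS];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Load one element per thread, padding with zero past the end.
    s[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction in shared memory: halve the active range each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = s[0];
}

int main()
{
    const int n = 1000;
    const int blocks = (n + THREADS - 1) / THREADS;

    float h_in[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;  // every element is 1, so the sum is n

    float *d_in = NULL, *d_out = NULL;
    cudaMalloc((void **)&d_in, n * sizeof(float));
    cudaMalloc((void **)&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, THREADS>>>(d_in, d_out, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy back the per-block partial sums and finish the reduction on the host.
    float h_out[blocks];
    cudaMemcpy(h_out, d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += h_out[b];
    printf("sum = %f\n", total);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Because this kernel is compiled together with the rest of the program, it is built for whatever architectures the program itself targets, which avoids depending on how a separate library was built.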