Different computed results on different GPUs

Hi all,

I am running the Needleman-Wunsch algorithm on my GeForce 8400 GS. The result is correct compared to my CPU version. But when I move the code to an 8600, the result is different.

Do you have any idea why this happens?

THANKS!!

With so little info, it’s most likely a bug in your program. :-)
It could also be a compiler bug, a bad graphics card, bad drivers on one machine, just about anything… But most likely it’s your code.

Without knowing your implementation details, it’s hard to give advice.
You could save checkpointed values at various points in your computation and compare them against the CPU version to see where the results diverge. Repeat that as a manual binary search and you may find where something Bad happened. Maybe it’s something simple, like an accidental dependence on block ordering. Running against the emulator can also help.
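To make the checkpoint comparison concrete, here is a minimal host-side sketch. It assumes the score matrix uses integer scores and is stored row-major, and that you have already copied the GPU copy back to the host. The name `first_divergence` and the buffer layout are illustrative assumptions, not anything from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first cell where two
// checkpoint dumps differ, or -1 if they match everywhere.
// 'cpu' comes from the CPU reference, 'gpu' is the same buffer copied
// back from the device at the same point in the computation.
long first_divergence(const std::vector<int>& cpu, const std::vector<int>& gpu) {
    assert(cpu.size() == gpu.size());  // both dumps must cover the same cells
    for (std::size_t i = 0; i < cpu.size(); ++i) {
        if (cpu[i] != gpu[i]) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Dividing the returned index by the row width gives the row, and the remainder gives the column. If the first bad cell sits near a tile or block boundary, that hints at an ordering problem rather than a scoring mistake.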

If you’re truly stuck, you might try to simplify your code to isolate the “difference” until you get it as small and simple and fast as possible, then post the code.

Is there a difference in the compute capability of the 8400 and 8600? Could one be 1.0 and the other 1.1?

8400 is G86 and 8600 is G84, and both are Compute 1.1.

Just at this point, I’d guess programmer bug due to a lack of evidence to the contrary; could there be a subtle race condition between blocks that wasn’t showing up on a card where you ran fewer blocks simultaneously?
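For reference, this is the dependency a race between blocks would violate. In Needleman-Wunsch, each cell needs its upper, left, and upper-left neighbours, so cells on one anti-diagonal are independent of each other, but every anti-diagonal has to be complete before the next one starts. The CPU-only sketch below fills the matrix both row by row and by anti-diagonals and should give identical matrices. The scoring values and function names are assumptions for illustration, not taken from the poster's code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Illustrative scoring scheme (an assumption, not from the thread).
const int MATCH = 1, MISMATCH = -1, GAP = -1;

// Score of cell (i, j) from its three already-computed neighbours.
int cell(const std::vector<int>& H, int cols, int i, int j,
         const std::string& a, const std::string& b) {
    int diag = H[(i - 1) * cols + (j - 1)] + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
    int up   = H[(i - 1) * cols + j] + GAP;
    int left = H[i * cols + (j - 1)] + GAP;
    return std::max(diag, std::max(up, left));
}

// First row and first column hold cumulative gap penalties.
std::vector<int> init_matrix(int rows, int cols) {
    std::vector<int> H(rows * cols, 0);
    for (int i = 0; i < rows; ++i) H[i * cols] = i * GAP;
    for (int j = 0; j < cols; ++j) H[j] = j * GAP;
    return H;
}

// Plain sequential fill, the usual CPU reference order.
std::vector<int> fill_row_major(const std::string& a, const std::string& b) {
    int rows = static_cast<int>(a.size()) + 1;
    int cols = static_cast<int>(b.size()) + 1;
    std::vector<int> H = init_matrix(rows, cols);
    for (int i = 1; i < rows; ++i)
        for (int j = 1; j < cols; ++j)
            H[i * cols + j] = cell(H, cols, i, j, a, b);
    return H;
}

// Wavefront fill: all cells with i + j == d depend only on diagonals
// d - 1 and d - 2, so the inner loop is the part that can run in parallel.
std::vector<int> fill_wavefront(const std::string& a, const std::string& b) {
    int rows = static_cast<int>(a.size()) + 1;
    int cols = static_cast<int>(b.size()) + 1;
    std::vector<int> H = init_matrix(rows, cols);
    for (int d = 2; d <= (rows - 1) + (cols - 1); ++d) {
        int i_lo = std::max(1, d - (cols - 1));
        int i_hi = std::min(rows - 1, d - 1);
        for (int i = i_lo; i <= i_hi; ++i) {
            int j = d - i;
            H[i * cols + j] = cell(H, cols, i, j, a, b);
        }
    }
    return H;
}
```

On the GPU, the inner loop is what gets spread across threads. If blocks working on diagonal d can start before every block on diagonal d - 1 has written its results, the outcome can depend on how many blocks a particular card runs at once, which would fit a bug that shows up on one card and not the other.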