Results wrong on GTX 280 but OK on GeForce 8600 GT, CUDA 2.1 (64-bit)

I have a program that produces the correct results on a GeForce 8600 GT, but produces wrong results on a GTX 280 (the wrong values vary randomly from run to run).

Similar problems may be found here…rt=#entry537419

People are blaming sync problems, so I tried placing __syncthreads() after each instruction of the kernels (to make sure it wasn't a sync problem) - that still didn't help.

Any ideas on what the problem may be?


I have since found the problem in my kernel (it was a synchronization problem).
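
For anyone who lands here with similar symptoms: the post above does not say what the actual bug was, so the following is only a generic sketch of one common kind of synchronization problem, not the kernel from this thread. It shows a shared-memory tile that every thread writes and then reads from a different index, with the barrier that makes this safe. The names reverse_tile, run_reverse_check and BLOCK are hypothetical, chosen for illustration. Without the barrier the kernel has a race, and whether such a race produces visibly wrong output can depend on the GPU and on how warps happen to be scheduled.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 256  // hypothetical block size, several warps per block

// Reverses each BLOCK-sized tile of `in` into `out` via shared memory.
__global__ void reverse_tile(const int *in, int *out)
{
    __shared__ int tile[BLOCK];
    int t = threadIdx.x;
    int base = blockIdx.x * BLOCK;

    tile[t] = in[base + t];

    // Barrier: all writes to tile[] by this block must be finished before
    // any thread reads an element written by another thread. Removing this
    // line turns the read below into a race.
    __syncthreads();

    out[base + t] = tile[BLOCK - 1 - t];
}

// Host-side check: runs the kernel and compares against the expected values.
bool run_reverse_check()
{
    const int blocks = 64;
    const int n = blocks * BLOCK;
    const size_t bytes = n * sizeof(int);

    int *h_in = new int[n];
    int *h_out = new int[n];
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = 0, *d_out = 0;
    bool ok = cudaMalloc((void **)&d_in, bytes) == cudaSuccess &&
              cudaMalloc((void **)&d_out, bytes) == cudaSuccess &&
              cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        reverse_tile<<<blocks, BLOCK>>>(d_in, d_out);
        // Launch errors are reported here; the blocking copy below also
        // waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    for (int i = 0; ok && i < n; ++i) {
        int base = (i / BLOCK) * BLOCK;
        int t = i % BLOCK;
        if (h_out[i] != base + (BLOCK - 1 - t)) ok = false;
    }

    cudaFree(d_in);
    cudaFree(d_out);
    delete[] h_in;
    delete[] h_out;
    return ok;
}

int main()
{
    bool ok = run_reverse_check();
    printf("%s\n", ok ? "results match" : "mismatch or CUDA error");
    return ok ? 0 : 1;
}
```

Two things to keep in mind when adding barriers to hunt for a bug like this: __syncthreads() only synchronizes the threads of one block, so it cannot fix a race between different blocks that touch the same global memory, and it must be reached by every thread of the block, so placing it inside a branch that only some threads take is itself an error. Either of these could explain why adding a barrier after every instruction does not make a synchronization bug go away, though that is a guess and not something stated in the post.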