CUDA 1.1 on 8800 Ultra: which hardware is best?

I am new to GPU programming and am currently looking for the right hardware to buy for CUDA 1.1 programming.
I have read that not all cards in the 8800 series support atomic operations. Furthermore, a friend of mine who is using an 8800 GTX told me he had a lot of trouble with thread synchronization because of the missing atomic operations.
So far I have only found one thread, which states that the 8800 GT supports atomic operations.
I am interested in purchasing an 8800 Ultra because I have large memory bandwidth requirements. So my question to you guys: does the 8800 Ultra support atomic operations and CUDA 1.1? Does anybody know where I can find a compatibility list for the 8800 series?


Atomics are available only on the 8600 GT, 8600 GTS and 8800 GT. All these cards have compute capability 1.1.

All other cards have compute capability 1.0 and atomics are not supported on them.
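If you want to check a card yourself rather than rely on a list, the runtime API reports the compute capability of each installed device. The following is a minimal sketch, not an official tool: `hasGlobalAtomics` is a hypothetical helper name, and its 1.1 threshold is taken from the statement above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when the compute capability is 1.1 or higher,
// which is the threshold for global-memory atomics stated above.
static bool hasGlobalAtomics(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 1);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA-capable device found\n");
        return 0;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        // prop.major / prop.minor hold the compute capability, e.g. 1 and 1.
        std::printf("%s: compute capability %d.%d, global atomics: %s\n",
                    prop.name, prop.major, prop.minor,
                    hasGlobalAtomics(prop.major, prop.minor) ? "yes" : "no");
    }
    return 0;
}
```

Going by the list above, an 8800 GT should report 1.1 and an 8800 GTX or Ultra should report 1.0.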

If you have large memory bandwidth needs, then atomic operations are not for you anyway. Getting the most out of the memory bandwidth requires fully coalesced reads (or very good data locality with textures) and fully coalesced writes. While it is possible to have coalesced atomic operations, one usually needs an atomic operation when multiple threads access the same data element, and that access pattern is typically not coalesced.
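To make the difference concrete, here is a rough sketch contrasting the two access patterns. The kernel names, the histogram use case and the `histogramTotal` host helper are illustrative assumptions, not something from this thread. The atomic kernel needs a compute capability 1.1 device (on toolkits of that era, build with something like `-arch=sm_11`).

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Coalesced pattern: thread i reads in[i] and writes out[i], so neighbouring
// threads touch neighbouring addresses.
__global__ void scaleCoalesced(const float* in, float* out, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = k * in[i];
}

// Atomic pattern: the target address depends on the data, so many threads
// may hit the same bin. This is the case where atomics are needed, and it
// is not a coalesced write.
__global__ void histogramAtomic(const unsigned char* data,
                                unsigned int* bins, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&bins[data[i]], 1u);
}

// Hypothetical host helper: builds a histogram of n bytes and returns the
// sum of all bins, which should equal n if no increment was lost.
unsigned int histogramTotal(int n) {
    if (n <= 0) return 0;
    std::vector<unsigned char> host(n);
    for (int i = 0; i < n; ++i) host[i] = (unsigned char)(i % 4);  // 4 hot bins

    unsigned char* dData = 0;
    unsigned int* dBins = 0;
    cudaMalloc(&dData, n);
    cudaMalloc(&dBins, 256 * sizeof(unsigned int));
    cudaMemcpy(dData, host.data(), n, cudaMemcpyHostToDevice);
    cudaMemset(dBins, 0, 256 * sizeof(unsigned int));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    histogramAtomic<<<blocks, threads>>>(dData, dBins, n);

    unsigned int bins[256];
    cudaMemcpy(bins, dBins, sizeof(bins), cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dBins);

    unsigned int total = 0;
    for (int b = 0; b < 256; ++b) total += bins[b];
    return total;
}

int main() {
    std::printf("sum of bins: %u\n", histogramTotal(1 << 16));
    return 0;
}
```

If your workload looks like `scaleCoalesced`, the lack of atomics on a compute capability 1.0 card costs you nothing.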

Global thread synchronization is practically impossible, even with atomic ops, and should be avoided at all costs for performance reasons. Just design your algorithm as if every thread is executing simultaneously, and you will have no problems.
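One common way to avoid needing a global barrier is to split the work into two kernels: launches issued to the same stream run in order, so the second kernel sees everything the first one wrote. The sketch below only illustrates that idea; the kernel names, the `runTwoPhase` helper and the toy computation are made up for this example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Phase 1: every thread writes only its own element.
__global__ void phase1(const int* in, int* tmp, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tmp[i] = 2 * in[i];
}

// Phase 2: every thread reads a neighbour's phase-1 result. This is safe
// because phase1 has fully finished before phase2 starts.
__global__ void phase2(const int* tmp, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tmp[i] + tmp[(i + 1) % n];
}

// Hypothetical host helper that runs both phases back to back.
std::vector<int> runTwoPhase(const std::vector<int>& in) {
    int n = (int)in.size();
    std::vector<int> out(n);
    if (n == 0) return out;

    size_t bytes = n * sizeof(int);
    int *dIn = 0, *dTmp = 0, *dOut = 0;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dTmp, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    phase1<<<blocks, threads>>>(dIn, dTmp, n);
    phase2<<<blocks, threads>>>(dTmp, dOut, n);  // starts after phase1 completes

    // The blocking copy waits for both kernels to finish.
    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dTmp);
    cudaFree(dOut);
    return out;
}

int main() {
    std::vector<int> in;
    for (int i = 1; i <= 4; ++i) in.push_back(i);
    std::vector<int> out = runTwoPhase(in);
    for (size_t i = 0; i < out.size(); ++i) std::printf("%d ", out[i]);
    std::printf("\n");
    return 0;
}
```

Within each kernel no thread depends on what another block has done so far, which fits the "every thread is executing simultaneously" rule above.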