Question about atomic operations

From the book, I understand that when we perform an atomic operation, the hardware guarantees that no other thread can read or write the value at address addr while the operation is in progress. But this confuses me. For instance:

Suppose I have two threads (thread0 and thread1) and a buffer called buf of size two (buf[0] and buf[1]). The address of buf[0] is 10000 and the address of buf[1] is 10001.

Now I call atomicAdd(&buf[threadIdx.x], 1), so thread0 adds 1 to buf[0] and thread1 adds 1 to buf[1]. My question is: do the two operations happen at the same time (since addresses 10000 and 10001 are different), or does thread1 perform its operation only after thread0 has finished?

Thanks for your help

It is performed much like a plain load/store, i.e. in SIMD fashion, with coalescing for global memory and banking for shared memory.
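Here is a minimal sketch of the case in the question, assuming an int buffer in global memory and one block of two threads. The kernel name addOnePerThread and the host code are made up for illustration. Each thread targets its own element, so the two atomic adds never touch the same address:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread atomically increments its own element,
// so no two threads target the same address.
__global__ void addOnePerThread(int *buf)
{
    atomicAdd(&buf[threadIdx.x], 1);
}

int main()
{
    const int n = 2;
    int host[n] = {0, 0};
    int *buf = nullptr;

    cudaMalloc(&buf, n * sizeof(int));
    cudaMemcpy(buf, host, n * sizeof(int), cudaMemcpyHostToDevice);

    addOnePerThread<<<1, n>>>(buf);  // one block, two threads

    // Blocking copy on the default stream: it runs after the kernel finishes.
    cudaMemcpy(host, buf, n * sizeof(int), cudaMemcpyDeviceToHost);

    printf("buf[0] = %d, buf[1] = %d\n", host[0], host[1]);

    cudaFree(buf);
    return 0;
}
```

Both elements should end up as 1. Error checking is left out to keep the sketch short.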

I'm confused. From what I understand, you said:

All of the threads in a half-warp access shared memory at the same time, which means thread0 accesses buf[0] and thread1 accesses buf[1] at the same time. Then they update buf[0] and buf[1] at the same time. Is that right?

yes

btw, half-warp access was used on GPUs from about 10 years ago, afair, so you may want to read a newer book :)

Thank you, that was really helpful : )

Note that your case is simple. In more complex cases, the coalescing/banking rules apply.
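For contrast, here is a sketch of a case where the threads do collide: every thread adds to the same address. This is where the guarantee from the book matters. The updates to that one location are applied one after another, so none is lost. The kernel name addAllToOne and the thread count of 256 are assumptions chosen for illustration:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread increments the same counter,
// so all the atomic adds contend for a single address.
__global__ void addAllToOne(int *counter)
{
    atomicAdd(counter, 1);
}

int main()
{
    const int threads = 256;  // arbitrary block size for the illustration
    int host = 0;
    int *counter = nullptr;

    cudaMalloc(&counter, sizeof(int));
    cudaMemcpy(counter, &host, sizeof(int), cudaMemcpyHostToDevice);

    addAllToOne<<<1, threads>>>(counter);

    // Blocking copy on the default stream: it runs after the kernel finishes.
    cudaMemcpy(&host, counter, sizeof(int), cudaMemcpyDeviceToHost);

    // With atomicAdd no increment is lost, so the counter equals the thread count.
    printf("counter = %d (expected %d)\n", host, threads);

    cudaFree(counter);
    return 0;
}
```

If the atomicAdd were replaced with a plain `*counter += 1`, the threads would race and the final value would generally be smaller than the thread count.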