Is writing an int to global memory atomic?

Multiple threads from different blocks simultaneously write an int to the same address. I don’t care which thread’s write ends up as the final value, only that the stored int is intact: all four bytes of the int must come from the same thread’s write, rather than some bytes from thread A and some from thread B. Can this be guaranteed with regular write operations (e.g., *p = 32)?
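
To make the scenario concrete, here is a minimal sketch of the kind of race I mean. The names `pattern` and `racyStore` are hypothetical and made up for illustration, and I use `unsigned int` instead of `int` only to keep the byte arithmetic well defined (same 4-byte width). Each block stores a value whose four bytes are identical, so a torn result would show up as mismatched bytes. Error checking is omitted, and a clean run would only mean no tearing was observed, not that it is guaranteed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: map a block index to a value whose four bytes are all
// equal and non-zero, so bytes mixed from two writers would not match.
__host__ __device__ inline unsigned int pattern(unsigned int block) {
    unsigned int b = (block % 255u) + 1u;  // byte value in 1..255
    return b * 0x01010101u;                // replicate it into all four bytes
}

// Thread 0 of every block does a plain (non-atomic) store to the same address.
__global__ void racyStore(unsigned int *p) {
    if (threadIdx.x == 0) {
        *p = pattern(blockIdx.x);
    }
}

int main() {
    unsigned int *d_p = nullptr;
    cudaMalloc(&d_p, sizeof(unsigned int));

    int torn = 0;
    for (int iter = 0; iter < 1000; ++iter) {
        cudaMemset(d_p, 0, sizeof(unsigned int));
        racyStore<<<1024, 32>>>(d_p);

        // cudaMemcpy on the default stream waits for the kernel to finish.
        unsigned int h = 0;
        cudaMemcpy(&h, d_p, sizeof(unsigned int), cudaMemcpyDeviceToHost);

        // An intact result has four equal, non-zero bytes.
        unsigned int b = h & 0xFFu;
        if (b == 0u || h != b * 0x01010101u) {
            ++torn;
        }
    }

    printf("torn results: %d\n", torn);
    cudaFree(d_p);
    return 0;
}
```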

CUDA architectures: Ampere and Hopper