Concurrent access to struct fields

I have a struct containing several fields. If I update these fields from different threads (assuming each thread modifies only its own field), will I encounter race conditions?

If the fields are word-sized and word-aligned, there is no problem. If the fields are sub-word size or misaligned (e.g., bytes), then yes, simultaneous writes to two different fields from two threads in the same warp might collide.

Now the one complex caveat: if your structure is in shared memory, and you selected 64-bit shared memory mode, then the write word size is effectively 64 bits. Simultaneous writes by two threads of the same warp to any parts of the same aligned 64 bit word will indeed collide, and only one write will succeed.

This is exactly my case. What do you mean by selecting 64-bit shared memory mode? Compiling for x86_64?

By the way, I am using an array as one field of the structure:

struct __align__(4) node {
        int children[3];
        int info;
};

Can different threads safely access the array elements concurrently if I use proper alignment?

No, it has nothing to do with compiling for x86_64. I mean Kepler’s 64-bit shared memory mode, which you would activate with the cudaDeviceSetSharedMemConfig() call in your host code.
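
As a rough sketch of what that host-side call looks like (assuming a toolkit that still provides this API and a device with configurable banks such as Kepler; on other hardware the request may be ignored or return an error):

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Request 8-byte (64-bit) shared memory banks for subsequent kernel launches.
    cudaError_t err = cudaDeviceSetSharedMemConfig(cudaSharedMemBankSizeEightByte);
    if (err != cudaSuccess) {
        printf("cudaDeviceSetSharedMemConfig: %s\n", cudaGetErrorString(err));
    }

    // Read the setting back to see which bank size is actually in effect.
    cudaSharedMemConfig cfg;
    err = cudaDeviceGetSharedMemConfig(&cfg);
    if (err == cudaSuccess) {
        printf("64-bit banks active: %s\n",
               cfg == cudaSharedMemBankSizeEightByte ? "yes" : "no");
    }
    return 0;
}

If you never make this call, you are in the default mode and the 64-bit caveat does not apply.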

In your example where fields are 32 bit words, if you do not have 64 bit memory mode activated, then your structure will work fine when threads update different fields simultaneously. If you do have 64 bit mode activated, then you will have a problem when two threads of the same warp simultaneously write to adjacent words that are in the same 64 bit bank.
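
To make the scenario concrete, here is a minimal sketch of four threads of one warp each writing a different 32-bit field of your node struct in shared memory. The kernel name write_fields and the values 100..103 are made up for illustration; only the struct comes from your post.

#include <cstdio>
#include <cuda_runtime.h>

struct __align__(4) node {
    int children[3];
    int info;
};

// Threads 0..3 of the same warp each write one distinct 32-bit field.
__global__ void write_fields(node *out) {
    __shared__ node n;
    int t = threadIdx.x;
    if (t < 3) {
        n.children[t] = 100 + t;
    } else if (t == 3) {
        n.info = 103;
    }
    __syncthreads();             // make all four writes visible to thread 0
    if (t == 0) *out = n;        // copy the result out for checking on the host
}

int main() {
    node *d_out = nullptr;
    node h_out = {};
    cudaMalloc(&d_out, sizeof(node));
    write_fields<<<1, 32>>>(d_out);
    cudaMemcpy(&h_out, d_out, sizeof(node), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    // Every field should hold the value its own thread wrote.
    bool ok = h_out.children[0] == 100 && h_out.children[1] == 101 &&
              h_out.children[2] == 102 && h_out.info == 103;
    printf("all fields intact: %s\n", ok ? "yes" : "no");
    return 0;
}

Running this once in each bank mode is a quick way to check the behavior on your own card.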

allanmac sent me a message, and it’s less complicated than I thought, but still mysterious. The CUDA C Programming Guide isn’t quite clear about what happens with subword writes in 64-bit mode. So I wrote a tool to actually test it, and found there was no problem in either 32-bit or 64-bit mode with writing 32-bit subwords. So that seems safe.

But even more surprisingly, I also did not see any problem writing even 16 or 8 bit subwords to shared or global memory!
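
This is not the tool I used, but a minimal sketch of the same kind of check for 8-bit writes in shared memory. The names write_bytes and iterations, and the value pattern, are made up for illustration. Each thread of a warp writes its own byte, then verifies its neighbor's byte.

#include <cstdio>
#include <cuda_runtime.h>

// 32 threads write 32 adjacent bytes, so 4 or 8 threads share each bank word.
__global__ void write_bytes(int *mismatches, int iterations) {
    __shared__ unsigned char bytes[32];
    int t = threadIdx.x;
    int neighbor = (t + 1) % 32;
    int bad = 0;
    for (int i = 0; i < iterations; ++i) {
        bytes[t] = (unsigned char)(t * 7 + i);       // simultaneous subword writes
        __syncthreads();
        unsigned char expected = (unsigned char)(neighbor * 7 + i);
        if (bytes[neighbor] != expected) ++bad;      // a lost write shows up here
        __syncthreads();                             // finish reads before next round
    }
    atomicAdd(mismatches, bad);
}

int main() {
    int *d_mismatches = nullptr;
    int h_mismatches = -1;
    cudaMalloc(&d_mismatches, sizeof(int));
    cudaMemset(d_mismatches, 0, sizeof(int));
    write_bytes<<<1, 32>>>(d_mismatches, 1000);
    cudaMemcpy(&h_mismatches, d_mismatches, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_mismatches);
    printf("mismatches: %d\n", h_mismatches);
    return 0;
}

A nonzero count would mean some byte write was lost. A zero count only tells you about the device you ran it on.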

This empirical test was on a Kepler, so I wouldn’t trust that for Fermi or especially Tesla architectures.

So the current summary after the empirical test: on a Kepler at least, your structure will have no problem with simultaneous writes. Surprisingly, this may be true even if the fields are 16 or 8 bits wide.