I need to perform an atomic compare-and-swap (CAS) on two fields at once: a pointer and a boolean.
This is similar to what AtomicMarkableReference offers in Java.
However, I am aware that CUDA does not provide any atomic multi-word CAS operation.
One workaround is to use the lowest bit of the pointer as a mark bit, assuming that bit is always zero because pointers to allocated memory are aligned to at least 2 bytes.
Here is what a paper I was reading says:
"In many modern architectures, a 32-bit word that stores a pointer has two unused bits. One of those can be used to store the mark bit...."
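To make the idea concrete, here is a minimal sketch of the mark-bit trick as I understand it. It is written so that it should build with nvcc or with a plain C++ compiler. The names Node, withMark, isMarked and stripMark are made up for illustration, and the whole thing assumes every Node pointer is at least 2-byte aligned, so bit 0 of the address is always zero.

```cpp
#include <cassert>
#include <cstdint>

// Let the same helpers build with nvcc (host + device) or a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical list node, used only for illustration.
struct Node {
    int   key;
    Node* next;
};
static_assert(alignof(Node) >= 2, "bit 0 of a Node* must be free for the mark");

// The lowest address bit doubles as the mark bit (assumption: it is otherwise zero).
constexpr std::uintptr_t kMarkBit = 1;

// Return p with its mark bit forced to the requested value.
HD inline Node* withMark(Node* p, bool mark) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    bits &= ~kMarkBit;            // AND with ...11110 clears bit 0
    if (mark) bits |= kMarkBit;   // OR with  ...00001 sets bit 0
    return reinterpret_cast<Node*>(bits);
}

// Read the mark bit without changing the pointer.
HD inline bool isMarked(Node* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & kMarkBit) != 0;
}

// Recover the real address; a marked pointer must not be dereferenced directly.
HD inline Node* stripMark(Node* p) {
    return reinterpret_cast<Node*>(reinterpret_cast<std::uintptr_t>(p) & ~kMarkBit);
}
```

The pointer is converted to an integer, masked, and converted back, and it always has to go through stripMark before being dereferenced.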
So my questions are:
Is this safe to assume for CUDA too? If not, can I use the atomicCAS(long long…) version to accomplish this?
How can I use bitwise operations to set the last bit of a 32-bit pointer to 0 or 1? I haven’t used them before, so I’m kind of lost.
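For the 64-bit route, this is roughly what I imagine it would look like: a sketch, not something I have verified on a GPU. It assumes a 64-bit build where a device pointer fits in an unsigned long long, and it uses the unsigned long long overload of atomicCAS. Node, pack and casMarked are hypothetical names. The host-only atomicCAS stand-in exists purely so the sketch can be compiled and single-thread tested without nvcc, and it is not atomic.

```cpp
#include <cassert>
#include <cstdint>

#ifndef __CUDACC__
// Host-only stand-in (NOT atomic) so the sketch builds without nvcc.
// Under nvcc this is skipped and the real CUDA atomicCAS is used.
#define __device__
inline unsigned long long atomicCAS(unsigned long long* address,
                                    unsigned long long compare,
                                    unsigned long long val) {
    unsigned long long old = *address;
    if (old == compare) *address = val;
    return old;
}
#endif

static_assert(sizeof(void*) <= sizeof(unsigned long long),
              "pointer must fit in the 64-bit CAS word");

// Hypothetical node: 'next' holds a pointer and a mark bit packed into one word.
struct Node {
    int                key;
    unsigned long long next;
};

// Pack a pointer and a mark into one 64-bit word; bit 0 carries the mark.
// Assumes the pointer is at least 2-byte aligned.
__device__ inline unsigned long long pack(Node* p, bool mark) {
    unsigned long long bits =
        static_cast<unsigned long long>(reinterpret_cast<std::uintptr_t>(p));
    return (bits & ~1ULL) | (mark ? 1ULL : 0ULL);
}

// Swap in (newPtr, newMark) only if the slot currently holds (expPtr, expMark).
// Returns true when the swap happened.
__device__ inline bool casMarked(unsigned long long* slot,
                                 Node* expPtr, bool expMark,
                                 Node* newPtr, bool newMark) {
    unsigned long long expected = pack(expPtr, expMark);
    unsigned long long desired  = pack(newPtr, newMark);
    // atomicCAS returns the old value; it equals 'expected' only on success.
    return atomicCAS(slot, expected, desired) == expected;
}
```

Since the pointer and the mark live in the same word, a single CAS compares and updates both together, which is the behaviour I am after from AtomicMarkableReference.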
Thanks for any suggestions.