Hello,
I want to increment an accumulator cell and I am using atomicInc(…).
My accumulator provides 2 bytes for each cell, so I have to check that
the value I want to increment has not already reached the maximum of 32767.
What’s the best way to do this?
I need an atomicInc function with a condition that checks the threshold, or something like that.
Of course I know that atomicInc is not the best option for
incrementing accumulator cells, but as a first step it’s the simplest way, and
I want to measure performance with this approach.
Increment, and then decrement on overflow.
If I increment with an atomic add and the value is 32767, then an overflow occurs, because I can’t stop the increment in between.
Yes, I was incorrect. On overflow, you just need to set the result to 32767 unconditionally.
Alternatively, you can use atomicCAS, which is designed exactly for building more complex atomic computations like this.
But I can’t stop the atomicAdd in between…
*val = 32767
atomicInc(val, 1);
so the program terminates because of the overflow.
If I read the value at the address val beforehand, more than one thread may
read the value, pass the test val < 32767, and then try to atomicInc the value.
So the overflow can still occur.
I need a reliable way for each thread to increment only if the current value is < 32767, and
each thread needs to see consistent data.
The program doesn’t terminate on overflow; it just calculates 32767+1 as -32768.
I also proposed a solution based on atomicCAS.
So I need to check whether the value is 32767 or < 0?
How does the solution with atomicCAS work?
As already indicated, you can build “custom atomic” functions based on atomicCAS. The programming guide demonstrates a sample “custom atomic” that implements a double-precision atomicAdd on devices that don’t natively support that operation:
[url]Programming Guide :: CUDA Toolkit Documentation
That example could be modified to perform your operation. Here’s another “custom atomic” example that updates two separate 32-bit (packed) variables, based on a min-test of one of them:
[url]cuda - How can I implement a custom atomic function involving several variables? - Stack Overflow
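As a rough sketch of how the programming guide's CAS loop might be adapted to a saturating increment: the version below assumes, for simplicity, that each cell is a full 32-bit int rather than a 2-byte cell. The helper name atomicIncSaturate, the kernel name accumulate, and the launch configuration are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA built-in): atomically increment *address
// unless it has already reached `limit`. Returns the value seen before the
// (possible) increment.
__device__ int atomicIncSaturate(int *address, int limit)
{
    int old = *address;
    int assumed;
    do {
        assumed = old;
        if (assumed >= limit) break;   // already saturated: leave the cell untouched
        // Write assumed + 1 only if the cell still holds `assumed`.
        old = atomicCAS(address, assumed, assumed + 1);
    } while (old != assumed);          // another thread changed the cell: retry
    return old;
}

__global__ void accumulate(int *cell, int limit)
{
    atomicIncSaturate(cell, limit);
}

int main()
{
    int h_cell = 32000;
    int *d_cell;
    cudaMalloc(&d_cell, sizeof(int));
    cudaMemcpy(d_cell, &h_cell, sizeof(int), cudaMemcpyHostToDevice);

    // 8 * 256 = 2048 increment attempts, but only 767 fit below the limit.
    accumulate<<<8, 256>>>(d_cell, 32767);

    cudaMemcpy(&h_cell, d_cell, sizeof(int), cudaMemcpyDeviceToHost);
    printf("cell = %d\n", h_cell);
    cudaFree(d_cell);
    return 0;
}
```

Because the threshold test and the write happen inside the same compare-and-swap attempt, no thread can push the cell past the limit, even if many threads read the same value at the same time.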
If your 16-bit quantities are packed (e.g. an array of short), then you can still do atomics on them by operating on the enclosing 32-bit int quantity, but masking/updating only the necessary bits.
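A minimal sketch of that masking idea, under some assumptions: the cells are a cudaMalloc'ed array of short with an even number of elements (so every cell sits inside an aligned 32-bit word that belongs to the array), and the GPU is little-endian. The names atomicIncSaturateShort and accumulate are hypothetical, not part of CUDA.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: saturating increment of one 16-bit cell, implemented
// with a 32-bit atomicCAS on the aligned word that contains the cell.
// Returns the cell value seen before the (possible) increment.
__device__ short atomicIncSaturateShort(short *address, short limit)
{
    size_t addr = (size_t)address;
    unsigned int *word = (unsigned int *)(addr & ~(size_t)3);  // enclosing 32-bit word
    unsigned int shift = (addr & 2) ? 16u : 0u;                // which half holds the cell
    unsigned int mask = 0xFFFFu << shift;

    unsigned int old = *word;
    unsigned int assumed;
    short cur;
    do {
        assumed = old;
        cur = (short)((assumed >> shift) & 0xFFFFu);           // extract the 16-bit cell
        if (cur >= limit) break;                               // saturated: do not write
        unsigned int next = (unsigned int)(unsigned short)(cur + 1);
        // Replace only this cell's bits; the neighbouring cell is kept as read.
        unsigned int updated = (assumed & ~mask) | (next << shift);
        old = atomicCAS(word, assumed, updated);
    } while (old != assumed);                                  // word changed: retry
    return cur;
}

__global__ void accumulate(short *cells, int numCells, short limit)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    atomicIncSaturateShort(&cells[tid % numCells], limit);
}

int main()
{
    short h_cells[4] = {32760, 0, 32767, 100};
    short *d_cells;
    cudaMalloc(&d_cells, sizeof(h_cells));
    cudaMemcpy(d_cells, h_cells, sizeof(h_cells), cudaMemcpyHostToDevice);

    accumulate<<<1, 64>>>(d_cells, 4, 32767);   // 16 increment attempts per cell

    cudaMemcpy(h_cells, d_cells, sizeof(h_cells), cudaMemcpyDeviceToHost);
    printf("%d %d %d %d\n", h_cells[0], h_cells[1], h_cells[2], h_cells[3]);
    cudaFree(d_cells);
    return 0;
}
```

With these inputs the first and third cells should end at 32767, and the other two should simply gain 16. Note that a CAS on one cell also retries when only the neighbouring cell in the same word changed, so heavy contention on adjacent cells costs extra loop iterations.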