Questions about shared memory

Hello everybody. I'm new to CUDA C and I don't understand how to use shared memory in CUDA.
I have a kernel like this:

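__global__ void start()
{
    __shared__ int count;   // one copy per thread block, shared by its threads

    count++;                // assumed form of the increment; the exact statement is missing from the post

    cuPrintf("%d\n", count);
}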


The problem is that I can't increment the "count" variable.
In every thread it only takes the value 0 or 1.
Could you please show me how to do that?

I'm Brazilian. Please excuse any grammar errors.
Thanks so much.

You need to use atomicAdd() here, because multiple threads are accessing the same variable in parallel. A plain increment is a separate read, add and write, so threads overwrite each other's updates. Note that atomicAdd() on shared memory is only available from compute capability 1.2 onwards.
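
To make that concrete, here is a minimal sketch of what it could look like. The kernel name countThreads, the output pointer blockTotals and the host helper runCountThreads are my own hypothetical names, not from the original post. I also assume the counter should start at zero, so one thread initializes it before anyone increments (shared memory is not zeroed for you).

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: every thread of a block adds 1 to one shared counter.
__global__ void countThreads(int *blockTotals)
{
    __shared__ int count;

    // Shared memory starts uninitialized, so one thread zeroes the counter.
    if (threadIdx.x == 0) count = 0;
    __syncthreads();               // everyone waits until count is 0

    atomicAdd(&count, 1);          // safe read-modify-write on shared memory
    __syncthreads();               // wait until all threads have added

    // One thread copies the block's total to global memory.
    if (threadIdx.x == 0) blockTotals[blockIdx.x] = count;
}

// Hypothetical host helper: launches a single block and returns its counter,
// or -1 if any CUDA call fails.
int runCountThreads(int threadsPerBlock)
{
    int *dTotal = nullptr;
    int hTotal = -1;
    if (cudaMalloc(&dTotal, sizeof(int)) != cudaSuccess) return -1;

    countThreads<<<1, threadsPerBlock>>>(dTotal);
    if (cudaGetLastError() != cudaSuccess ||
        cudaMemcpy(&hTotal, dTotal, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        hTotal = -1;
    }
    cudaFree(dTotal);
    return hTotal;
}
```

With one block of 64 threads, runCountThreads(64) should give 64, since each thread contributes exactly one increment. Keep in mind that count is per block: each block gets its own copy.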

Oh, thanks for the tips.

But what if I use an array? For example, what if I want one group of threads to increment index[0] and another group to increment index[5]?

Thanks again.

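Still use atomicAdd(). Just pass the address of the element each thread should update, e.g. atomicAdd(&index[0], 1) or atomicAdd(&index[5], 1).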
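
A rough sketch of the array case, under some assumptions of mine: the array has NUM_BINS = 8 elements, even-numbered threads form the group that updates index[0], and odd-numbered threads form the group that updates index[5]. The names countGroups and runCountGroups are hypothetical too; use whatever grouping rule fits your problem.

```cuda
#include <cuda_runtime.h>

#define NUM_BINS 8   // hypothetical array size, chosen for illustration

// Hypothetical kernel: two groups of threads increment different elements
// of a shared array.
__global__ void countGroups(int *out)
{
    __shared__ int index[NUM_BINS];

    // Zero the shared array cooperatively before counting.
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x) index[i] = 0;
    __syncthreads();

    // Assumed grouping: even threads use slot 0, odd threads use slot 5.
    int slot = (threadIdx.x % 2 == 0) ? 0 : 5;
    atomicAdd(&index[slot], 1);
    __syncthreads();

    // Copy the shared counters to global memory so the host can read them.
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x) out[i] = index[i];
}

// Hypothetical host helper: runs one block and fills hostOut[NUM_BINS].
// Returns true if all CUDA calls succeeded.
bool runCountGroups(int threadsPerBlock, int *hostOut)
{
    int *dOut = nullptr;
    if (cudaMalloc(&dOut, NUM_BINS * sizeof(int)) != cudaSuccess) return false;

    countGroups<<<1, threadsPerBlock>>>(dOut);
    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaMemcpy(hostOut, dOut, NUM_BINS * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dOut);
    return ok;
}
```

Atomics on different elements do not interfere with each other; only threads hitting the same element are serialized.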