How can I do a parallel addition (or run a serialized section) inside a kernel?
// int* t: array of integers, one element per block
__global__ void mykernel(int* t)
{
    // huge parallel task
    // each thread calculates one value
    // now the values of all threads must be summed, but how?
    // t[blockIdx.x] = sum of the values of all threads in the block
    // I thought
    // t[blockIdx.x] += value of thread
    // would work, but it doesn't (only the value of the last thread is saved)
}
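
For reference, a minimal sketch of one common way to get the per-block sum: a tree reduction in shared memory. The plain `+=` fails because every thread reads, modifies, and writes `t[blockIdx.x]` without synchronization, so the updates overwrite each other. The names `THREADS_PER_BLOCK`, `partial`, and `run_block_sum_demo` are assumptions made up for this example, the per-thread value is a placeholder, and the block size is assumed to be a power of two.

```cuda
#include <cuda_runtime.h>

// Assumption: the block size is a power of two and known at compile time.
#define THREADS_PER_BLOCK 256

// Each thread computes one value; the values of a block are summed in
// shared memory and thread 0 writes the result to t[blockIdx.x].
__global__ void mykernel(int* t)
{
    __shared__ int partial[THREADS_PER_BLOCK];

    // Placeholder for the real per-thread calculation
    // (hypothetical: every thread contributes 1).
    int value = 1;

    partial[threadIdx.x] = value;
    __syncthreads();  // all values must be stored before anyone reads them

    // Tree reduction: halve the number of active threads in each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();  // reached by every thread, outside the if
    }

    // A single thread writes the block's sum, so there is no race.
    if (threadIdx.x == 0)
        t[blockIdx.x] = partial[0];
}

// Hypothetical host helper: launches the kernel and checks that every
// block summed to THREADS_PER_BLOCK. Returns 0 on success, 1 otherwise.
int run_block_sum_demo()
{
    const int blocks = 4;
    int h_t[blocks];
    int* d_t = nullptr;

    if (cudaMalloc(&d_t, blocks * sizeof(int)) != cudaSuccess)
        return 1;

    mykernel<<<blocks, THREADS_PER_BLOCK>>>(d_t);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(h_t, d_t, blocks * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_t);
    if (err != cudaSuccess)
        return 1;

    for (int b = 0; b < blocks; ++b)
        if (h_t[b] != THREADS_PER_BLOCK)
            return 1;
    return 0;
}
```

Calling `run_block_sum_demo()` from `main` exercises the kernel. A simpler alternative is `atomicAdd(&t[blockIdx.x], value);` in every thread, provided `t` is zeroed before the launch; it serializes the updates, so it is usually slower than the reduction for large blocks.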
I hope someone can help :)