Hi everyone,
I am new to CUDA and tried to write my first CUDA code using the += operator.
The kernel is as follows:
[codebox]__global__ void SumArray(float *a, float *c, const unsigned int N)
{
    unsigned int i = threadIdx.x + blockIdx.x * blockDim.x;
    unsigned int j = threadIdx.y + blockIdx.y * blockDim.y;
    unsigned int index = i * N + j;
    float cSub = 0;
    cSub += a[index];
    __syncthreads();
    c[0] = cSub;
    return;
}[/codebox]
The kernel returns cSub = 1 (all elements of a are 1). Each thread has
its own cSub value, so the result may be logical, but I call __syncthreads()
to handle this situation. I have a Quadro FX 5600 and use CUDA 2.0.
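For comparison, here is a minimal sketch of how a sum like this is commonly written. It assumes a shared-memory tree reduction inside each block, with the per-block partial sums added up on the host. The names SumArrayBlocks, BLOCK, cache, and partial are hypothetical and not part of the kernel above. It avoids floating-point atomics, which early GPUs such as the FX 5600 do not provide. Treat it as an illustration only, not a tuned implementation.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define BLOCK 256  /* threads per block (assumed; must be a power of two) */

/* Hypothetical kernel: each block sums its slice of a into partial[blockIdx.x]. */
__global__ void SumArrayBlocks(const float *a, float *partial, unsigned int n)
{
    __shared__ float cache[BLOCK];      /* shared by all threads of one block */
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + tid;

    cache[tid] = (i < n) ? a[i] : 0.0f; /* pad the tail with zeros */
    __syncthreads();

    /* Tree reduction: halve the number of active threads at each step. */
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            cache[tid] += cache[tid + s];
        __syncthreads();                /* barrier is outside the if, so every thread reaches it */
    }

    if (tid == 0)
        partial[blockIdx.x] = cache[0]; /* one result per block */
}

int main(void)
{
    const unsigned int n = 1000;
    const unsigned int blocks = (n + BLOCK - 1) / BLOCK;
    unsigned int k;

    float *h_a = (float *)malloc(n * sizeof(float));
    float *h_partial = (float *)malloc(blocks * sizeof(float));
    for (k = 0; k < n; ++k) h_a[k] = 1.0f;

    float *d_a, *d_partial;
    cudaMalloc((void **)&d_a, n * sizeof(float));
    cudaMalloc((void **)&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);

    /* The block size must equal BLOCK because cache[] is sized with it. */
    SumArrayBlocks<<<blocks, BLOCK>>>(d_a, d_partial, n);
    cudaMemcpy(h_partial, d_partial, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    /* Finish on the host: add the per-block partial sums. */
    float sum = 0.0f;
    for (k = 0; k < blocks; ++k) sum += h_partial[k];
    printf("sum = %f\n", sum);

    cudaFree(d_a);
    cudaFree(d_partial);
    free(h_a);
    free(h_partial);
    return 0;
}
```

The difference from the kernel above is that the running total lives in shared memory rather than in a per-thread local variable, so __syncthreads() has something shared to order. It is a barrier for the threads of one block and does not combine the threads' private values.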
Thanks for any advice…