Newbie question about shared memory

Hello guys!

I’m trying to calculate the sum of a simple vector.

I think the most difficult concept to understand in OpenCL is shared (local) memory.

So, all work-items (threads) in the same work-group (block) share the “__local” memory, right?

To test this, I wrote a kernel that sums a vector using only one work-group.

Here is the kernel:

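__kernel void VectorAdd(__global const float* vector, __global float* result, __local float *partResult)
{
	int id = get_global_id(0);

	// Work-item 0 zeroes the shared accumulator.
	if(id == 0)
		*partResult = 0;

	barrier(CLK_LOCAL_MEM_FENCE);

	// Every work-item adds its own element to the shared accumulator.
	*partResult += vector[id];

	barrier(CLK_LOCAL_MEM_FENCE);

	// The last work-item copies the total to global memory.
	if(id == get_global_size(0)-1)
		*result = *partResult;
}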

I initialize the __local variable to 0, synchronize, do the sum, synchronize again, and copy the result to global memory.
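
For comparison, the value I expect to end up in result is simply the sequential sum of the vector. Below is a minimal host-side sketch in plain C of that reference computation. The function name reference_sum and its parameters are hypothetical and only for illustration; they are not part of my actual host code.

```c
#include <stddef.h>

/* Hypothetical helper: sequential sum of n floats on the host.
 * This is the value the kernel is expected to write to *result. */
float reference_sum(const float *vector, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += vector[i];   /* accumulate one element at a time */
    }
    return total;
}
```

Comparing the kernel output against a reference like this (with a small tolerance for floating-point rounding on larger inputs) would at least show how far off the result is.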

What is wrong with this code?

Keep in mind that there is only one work-group.

Thanks!