The use of __syncthreads()

Hello everyone,

Based on this post https://stackoverflow.com/questions/15240432/does-syncthreads-synchronize-all-threads-in-the-grid, I can understand that this function is used to synchronize threads inside a block.

I used the function in my code, but I'm not sure whether my usage is correct, so I need some guidance. Here's my kernel, which adds two vectors:

__global__ void
vectorAdd(const float *A, const float *B, float *C, int numElements)
{
    int nIter = 1000;

    for (int k = 0; k < nIter; ++k)
    {
        int i = blockDim.x * blockIdx.x + threadIdx.x;

        if (i < numElements)
        {
            C[i] = A[i] + B[i];
        }
        __syncthreads();
    }
}

PS: I put the loop inside the kernel because I want it to run longer for power consumption measurements.

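__syncthreads() isn't needed there at all.

It serves no useful purpose: each thread only reads and writes its own element, so no data is exchanged between the threads of a block.

However, your usage there is not illegal, since the call sits outside the if statement and every thread in the block reaches it.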
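
For contrast, here is a minimal sketch of a case where the barrier does matter: threads exchange data through shared memory, so each thread has to wait until the others have finished writing before it reads. The kernel name reverseBlock, the helper runReverse, and the BLOCK size are hypothetical names chosen for this illustration; they are not from the code in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 64  // assumed block size for this sketch

// Hypothetical kernel: reverses an n-element array (n <= BLOCK) using one block.
__global__ void reverseBlock(int *d, int n)
{
    __shared__ int s[BLOCK];
    int t = threadIdx.x;

    if (t < n)
    {
        s[t] = d[t];          // each thread writes one shared-memory slot
    }
    __syncthreads();          // outside the if, so every thread reaches it
    if (t < n)
    {
        d[t] = s[n - 1 - t];  // reads a slot written by a different thread
    }
}

// Hypothetical host helper: runs the kernel and checks the result.
bool runReverse()
{
    int h[BLOCK];
    for (int i = 0; i < BLOCK; ++i)
    {
        h[i] = i;
    }

    int *d = nullptr;
    if (cudaMalloc(&d, sizeof(h)) != cudaSuccess)
    {
        return false;
    }
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    reverseBlock<<<1, BLOCK>>>(d, BLOCK);
    cudaError_t err = cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess)
    {
        return false;
    }

    for (int i = 0; i < BLOCK; ++i)
    {
        if (h[i] != BLOCK - 1 - i)
        {
            return false;
        }
    }
    return true;
}

int main()
{
    bool ok = runReverse();
    printf("%s\n", ok ? "array reversed correctly" : "reverse failed");
    return ok ? 0 : 1;
}
```

Without the barrier, a thread could read s[n - 1 - t] before the thread that owns that slot has written it. In the vectorAdd kernel no thread ever reads what another thread wrote, which is why the barrier there has nothing to protect.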

Thank you for your quick response.

However, I did see an impact on run time: it increases when __syncthreads() is used, especially for a small kernel like this one.

So it's better to remove it, because its cost is significant relative to the short run time of a small kernel.
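
For anyone who wants to reproduce the comparison, here is a rough sketch of how the two variants could be timed with CUDA events. The template parameter UseSync, the helpers timeLaunchMs and compareSyncCost, and the sizes (50000 elements, 256 threads per block) are assumptions made for this illustration. The compiler may also optimize the two variants differently, so treat the numbers as indicative only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same loop as the kernel in the question; UseSync toggles the barrier at compile time.
template <bool UseSync>
__global__ void vectorAddLoop(const float *A, const float *B, float *C,
                              int numElements, int nIter)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;

    for (int k = 0; k < nIter; ++k)
    {
        if (i < numElements)
        {
            C[i] = A[i] + B[i];
        }
        if (UseSync)
        {
            __syncthreads();  // UseSync is uniform, so every thread reaches the barrier
        }
    }
}

// Times a single launch with CUDA events and returns the elapsed milliseconds.
template <bool UseSync>
float timeLaunchMs(const float *A, const float *B, float *C, int n, int nIter)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    vectorAddLoop<UseSync><<<blocks, threads>>>(A, B, C, n, nIter);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Hypothetical helper: allocates zeroed buffers, warms up, then times both variants.
bool compareSyncCost(float *msWithSync, float *msWithoutSync)
{
    const int n = 50000;    // assumed problem size
    const int nIter = 1000; // same iteration count as in the question
    const size_t bytes = n * sizeof(float);
    float *A = nullptr, *B = nullptr, *C = nullptr;

    bool ok = cudaMalloc(&A, bytes) == cudaSuccess &&
              cudaMalloc(&B, bytes) == cudaSuccess &&
              cudaMalloc(&C, bytes) == cudaSuccess;
    if (ok)
    {
        cudaMemset(A, 0, bytes);
        cudaMemset(B, 0, bytes);
        timeLaunchMs<true>(A, B, C, n, 1);  // warm-up launch, result ignored
        *msWithSync = timeLaunchMs<true>(A, B, C, n, nIter);
        *msWithoutSync = timeLaunchMs<false>(A, B, C, n, nIter);
        ok = cudaDeviceSynchronize() == cudaSuccess;
    }
    cudaFree(A);  // cudaFree on a null pointer is a no-op
    cudaFree(B);
    cudaFree(C);
    return ok;
}

int main()
{
    float withSync = 0.0f, withoutSync = 0.0f;
    if (!compareSyncCost(&withSync, &withoutSync))
    {
        printf("CUDA error during timing\n");
        return 1;
    }
    printf("with __syncthreads():    %.3f ms\n", withSync);
    printf("without __syncthreads(): %.3f ms\n", withoutSync);
    return 0;
}
```

A single launch per variant is noisy; averaging over several launches would give a steadier comparison.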