Hello everyone,
Based on this post https://stackoverflow.com/questions/15240432/does-syncthreads-synchronize-all-threads-in-the-grid, I understand that __syncthreads() synchronizes only the threads within a block, not all threads in the grid.
I call it at the end of each loop iteration in my code, but I’m not sure whether that is correct, so I need some guidance. Here’s my kernel, which adds two vectors:
__global__ void
vectorAdd (const float *A, const float *B, float *C, int numElements)
{
    int nIter = 1000;
    for (int k = 0; k < nIter; ++k)
    {
        int i = blockDim.x * blockIdx.x + threadIdx.x;
        if (i < numElements)
        {
            C[i] = A[i] + B[i];
        }
        __syncthreads();
    }
}
PS: I put the loop inside the kernel because I want the kernel to run longer, so that I can take power consumption measurements.
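
In case it helps anyone reproduce this, here is a minimal, self-contained sketch of how a kernel like the one above could be launched and checked. The host-side names (runVectorAdd, CUDA_CHECK), the element count (50000), and the block size (256) are assumptions for illustration only, not values from my real code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical helper: on a CUDA error, print it and return -1 from the caller.
// For brevity this sketch does not free memory on the error path.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
            return -1;                                                    \
        }                                                                 \
    } while (0)

// Same kernel as in the post: the addition is repeated nIter times.
__global__ void
vectorAdd (const float *A, const float *B, float *C, int numElements)
{
    int nIter = 1000;
    for (int k = 0; k < nIter; ++k)
    {
        int i = blockDim.x * blockIdx.x + threadIdx.x;
        if (i < numElements)
        {
            C[i] = A[i] + B[i];
        }
        // Outside the `if`, so every thread of the block reaches the barrier.
        __syncthreads();
    }
}

// Runs the kernel on numElements floats and returns the number of
// mismatches against a host-side reference (-1 on a CUDA error).
int runVectorAdd (int numElements)
{
    size_t bytes = (size_t)numElements * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    if (!hA || !hB || !hC) return -1;

    for (int i = 0; i < numElements; ++i)
    {
        hA[i] = 0.5f * i;    // arbitrary test data (assumption)
        hB[i] = 0.25f * i;
    }

    float *dA = NULL, *dB = NULL, *dC = NULL;
    CUDA_CHECK(cudaMalloc((void **)&dA, bytes));
    CUDA_CHECK(cudaMalloc((void **)&dB, bytes));
    CUDA_CHECK(cudaMalloc((void **)&dC, bytes));
    CUDA_CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));

    int threadsPerBlock = 256;  // assumed block size
    int blocksPerGrid = (numElements + threadsPerBlock - 1) / threadsPerBlock;
    vectorAdd<<<blocksPerGrid, threadsPerBlock>>>(dA, dB, dC, numElements);
    CUDA_CHECK(cudaGetLastError());        // launch errors
    CUDA_CHECK(cudaDeviceSynchronize());   // wait for the kernel to finish

    CUDA_CHECK(cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (int i = 0; i < numElements; ++i)
    {
        if (fabsf(hC[i] - (hA[i] + hB[i])) > 1e-5f) ++mismatches;
    }

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return mismatches;
}

int main (void)
{
    int mismatches = runVectorAdd(50000);
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

This should build with nvcc (for example, nvcc vector_add.cu -o vector_add, where the file name is just a placeholder). The host code waits on cudaDeviceSynchronize() before copying the result back, so a power measurement would need to be taken while that call is blocking.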