Hi all, I’m still quite new to CUDA. As the title says, I’m looking for a way to find the minimum value in a 2D array. Using the reduction example from the SDK Browser, I’ve managed to get it working for the first row, but I don’t know the right way to build up the algorithm so that it finds the minimum over many rows.
My kernel is as follows:
__global__ void test_ker(float *b, float *o, int n) {
    // shared scratch space, one float per thread
    // (fixed size, so this assumes blockDim.x <= 32)
    __shared__ float value[32];

    // perform first level of reduction,
    // reading from global memory, writing to shared memory
    // note: n is not checked here, so the grid must cover the input exactly
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x*(blockDim.x*2) + threadIdx.x;
    value[tid] = fminf(b[i], b[i+blockDim.x]);
    __syncthreads();

    // do reduction in shared mem (assumes blockDim.x is a power of two)
    for (unsigned int s = blockDim.x/2; s > 0; s >>= 1)
    {
        if (tid < s)
        {
            value[tid] = fminf(value[tid], value[tid+s]);
        }
        __syncthreads();
    }

    // write result for this block to global mem
    if (tid == 0) o[blockIdx.x] = value[0];
}
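One direction I have been considering, as a rough sketch rather than a tested solution: if the 2D array is stored contiguously (row-major), it can be treated as one flat array of rows*cols floats, and the same kind of kernel can be launched repeatedly on the per-block results until a single value is left. The names min_reduce, min2d and THREADS below are made up for illustration, the block size of 256 is an assumption, and error checking is left out.

```cuda
#include <cuda_runtime.h>
#include <math.h>
#include <utility>
#include <vector>

#define THREADS 256  // assumed block size; must be a power of two

// Each block reduces up to 2*THREADS elements of `in` to one minimum in `out`.
__global__ void min_reduce(const float *in, float *out, unsigned int n) {
    __shared__ float value[THREADS];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * (blockDim.x * 2) + threadIdx.x;

    // Out-of-range positions contribute +infinity, the identity for min.
    float a = (i < n) ? in[i] : INFINITY;
    float c = (i + blockDim.x < n) ? in[i + blockDim.x] : INFINITY;
    value[tid] = fminf(a, c);
    __syncthreads();

    // Tree reduction in shared memory.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) value[tid] = fminf(value[tid], value[tid + s]);
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = value[0];
}

// Hypothetical host helper: minimum of a rows x cols array stored
// contiguously (row-major) in host memory. Error checking omitted.
float min2d(const std::vector<float> &h, int rows, int cols) {
    unsigned int n = (unsigned int)rows * (unsigned int)cols;
    if (n == 0) return INFINITY;  // nothing to reduce
    unsigned int blocks = (n + 2 * THREADS - 1) / (2 * THREADS);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Keep reducing until one value is left, ping-ponging the two buffers.
    while (true) {
        blocks = (n + 2 * THREADS - 1) / (2 * THREADS);
        min_reduce<<<blocks, THREADS>>>(d_in, d_out, n);
        if (blocks == 1) break;
        std::swap(d_in, d_out);  // last pass's output is the next input
        n = blocks;
    }

    float result;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Calling min2d(data, rows, cols) on the host would then return the overall minimum. The row structure does not matter for a global minimum, which is why the sketch ignores it. If a separate minimum per row were needed instead, a different layout (for example one block per row) would probably be more suitable.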
Any help on this topic would be very much appreciated. Many thanks in advance!