atomic min/max for real data

Currently, atomic min only works for INTEGER data. For real data, what do you suggest if I want to find the minimum value of an array in which each element is processed by one instance of the kernel?



Tuan

Hi Tuan,

Atomic operations need hardware support, and NVIDIA doesn’t support atomicmin operations on floats, so we’re limited as to what we can do here. Do you really need the operation to be atomic, or could the elemental min function work?

  • Mat

The elemental min function would be fine. Do you have any suggestions, Mat?

Tuan

If the elemental min function works, then I’d use it. If you must use the atomic min, then you’d need to change your array from REAL to INTEGER*4.
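
As a minimal sketch of the integer route, assuming the data can be held as INTEGER*4: each thread folds its own element into a single device variable with atomicmin. The names atomicmin_sketch, find_min, and dMinInt are hypothetical, and the block size of 256 is an arbitrary choice.

```fortran
module atomicmin_sketch
  use cudafor
  implicit none
  ! Hypothetical global result; the host sets it before the launch.
  integer, device :: dMinInt
contains
  attributes(global) subroutine find_min(Ad, N)
    integer, value :: N
    integer, device :: Ad(N)
    integer :: i, old
    i = (blockidx%x - 1) * blockdim%x + threadidx%x
    if (i <= N) then
      ! atomicmin stores min(dMinInt, Ad(i)) and returns the previous value
      old = atomicmin(dMinInt, Ad(i))
    end if
  end subroutine find_min
end module atomicmin_sketch

program test_atomicmin
  use cudafor
  use atomicmin_sketch
  implicit none
  integer, parameter :: N = 1000
  integer :: A(N), hMin, i
  integer, device :: Ad(N)

  A = [(N - i, i = 1, N)]        ! values N-1 down to 0
  Ad = A
  dMinInt = huge(hMin)           ! start from the largest integer
  call find_min<<<(N + 255)/256, 256>>>(Ad, N)
  hMin = dMinInt                 ! copy the result back to the host
  print *, hMin, minval(A)       ! the two values should agree
end program test_atomicmin
```

Because every thread updates the same address, this serializes under heavy contention, so it is best seen as the simplest correct form rather than the fastest one.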

  • Mat

Hi Mat,

The reason is that I want the min to work on the GPU. However, I’m not sure whether the elemental min guarantees the true minimum value across the threads. Could you please confirm this?

Example: suppose minval was assigned the MAXIMUM value before calling the subroutine

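attributes(global) subroutine foo(A, N, minval)
 integer, value :: N
 real, dimension(N,N) :: A
 real :: minval
 integer :: idx
 real :: min1

 idx = threadIdx%x
 if (idx .le. N) then
   min1 = min(A(idx,:))        ! intended: minimum of row idx
   minval = min(min1, minval)  ! every thread updates the same result
 endif

end subroutine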

Tuan

Hi Tuan,

For arrays, you want to use the reduction intrinsic “minval”. Something like:

% cat minreduc.cuf

module minutil

  real, device :: dMinval
  real, device :: dMaxval

contains

  attributes(global) subroutine foo (Ad,N)
    use cudafor
    implicit none

    integer, value :: N
    real, device, dimension(N,N) :: Ad
    integer i, j, tx, ty

    tx = threadidx%x
    ty = threadidx%y
    i = (blockidx%x-1)*16 + tx
    j = (blockidx%y-1)*16 + ty

    Ad(i,j) =  (N*(i-1))+(j-1)

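    ! Note: syncthreads() is a barrier within one thread block only; it does
    ! not wait for other blocks to finish writing Ad.
    call syncthreads()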
    if (i .eq. 1 .and. j .eq. 1 ) then
       dMinval = minval(Ad)
       dMaxval = maxval(Ad)
    endif
    call syncthreads()

  end subroutine foo

  subroutine testmin ()
    use cudafor
    implicit none
    integer, parameter :: N = 64
    real, dimension(N,N) :: A
    real, device, dimension(N,N) :: Ad
    real :: minval, maxval
    type(dim3) :: dimGrid, dimBlock
    A=-1
    dMinval = -1
    dimGrid=dim3(N/16,N/16,1)
    dimBlock=dim3(16,16,1)
    Ad=0
    call foo<<<dimGrid,dimBlock>>>(Ad,N)
    A=Ad
    minval = dMinval
    maxval = dMaxval
    print *, minval, maxval
    print *, A(1,1), A(N,N)
  end subroutine testmin

end module minutil

program testme

  use minutil

  call testmin

end program testme

% pgfortran -o minreduc.out minreduc.cuf
% minreduc.out
    0.000000        4095.000
    0.000000        4095.000
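
Since syncthreads() only synchronizes the threads of one block, a variant that does not depend on block scheduling is to fill the array in one kernel and reduce it in a second one. The sketch below assumes the same device-side minval/maxval usage as the example above; the names minreduc_two_kernels, fill, and reduce are hypothetical. Kernels launched on the same stream run in order, so the second kernel sees the completed array.

```fortran
module minreduc_two_kernels
  use cudafor
  implicit none
  real, device :: dMinval, dMaxval
contains
  ! First kernel: one thread per element writes Ad(i,j).
  attributes(global) subroutine fill(Ad, N)
    integer, value :: N
    real, device, dimension(N,N) :: Ad
    integer :: i, j
    i = (blockidx%x - 1) * blockdim%x + threadidx%x
    j = (blockidx%y - 1) * blockdim%y + threadidx%y
    if (i <= N .and. j <= N) Ad(i,j) = (N*(i-1)) + (j-1)
  end subroutine fill

  ! Second kernel: meant to be launched with a single thread.
  attributes(global) subroutine reduce(Ad, N)
    integer, value :: N
    real, device, dimension(N,N) :: Ad
    dMinval = minval(Ad)
    dMaxval = maxval(Ad)
  end subroutine reduce
end module minreduc_two_kernels

program test_two_kernels
  use cudafor
  use minreduc_two_kernels
  implicit none
  integer, parameter :: N = 64
  real, device, dimension(N,N) :: Ad
  real :: hMin, hMax

  call fill<<<dim3(N/16,N/16,1), dim3(16,16,1)>>>(Ad, N)
  call reduce<<<1,1>>>(Ad, N)    ! runs after fill has finished
  hMin = dMinval
  hMax = dMaxval
  print *, hMin, hMax
end program test_two_kernels
```

The single-thread reduction is serial, so this trades speed for simplicity; it is a sketch of the ordering idea, not a tuned reduction.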

Hi Mat,
Does minval support double precision as well? The documentation just says it supports real, so I’m not sure whether that includes double precision.
Could you please explain the difference between min and minval? Can min not be used on the GPU?

Thanks,
Tuan

Hi Tuan,

Does minval support double precision as well? The documentation just says it supports real, so I’m not sure whether that includes double precision.

Yes, minval supports double precision. When the docs say ‘real’, they mean both kinds.

Could you please explain the difference between min and minval?

min determines which of two or more scalar values is the minimum value. minval finds the minimum value of an array.

Can min not be used on the GPU?

min can be used on the GPU. However, your code is trying to find the minimum value of an array, hence the use of minval.
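
A small host-side sketch of the difference, in standard Fortran with made-up sample values (the program name and arrays are hypothetical):

```fortran
program min_vs_minval
  implicit none
  real :: a(4) = [3.0, 1.5, 4.0, 2.0]
  double precision :: d(3) = [2.5d0, -1.0d0, 7.0d0]

  ! min compares two or more arguments and returns the smallest.
  print *, min(3.0, 1.5, 4.0)   ! value 1.5

  ! min is elemental: with an array argument it works element by element
  ! and returns an array, not a single minimum.
  print *, min(a, 2.5)          ! values 2.5, 1.5, 2.5, 2.0

  ! minval reduces a whole array to its single smallest element.
  print *, minval(a)            ! value 1.5

  ! minval accepts double precision arrays as well.
  print *, minval(d)            ! value -1.0
end program min_vs_minval
```

Note that min(a) with a single array argument is not valid, since min requires at least two arguments; that is the case where minval is the right intrinsic.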

  • Mat

All clear. Thanks, Mat.

Tuan