Currently, atomic min only works for INTEGER data. For REAL data, what would you suggest if I want to find the minimum value of an array in which each element is processed by one instance of the kernel?
Tuan
Hi Tuan,
Atomic operations need hardware support, and NVIDIA doesn’t support atomicmin operations on floats, so we’re limited as to what we can do here. Do you really need the operation to be atomic, or could the elemental min function work?
- Mat
The elemental min function would be fine. Do you have any suggestions, Mat?
Tuan
If the elemental min function works, then I’d use it. If you must use the atomic min, then you’d need to change your array from REAL to INTEGER*4.
- Mat
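To illustrate the INTEGER*4 route Mat describes, a minimal sketch might look like the following. It assumes the data can be held as 32-bit integers; the names intmin_sketch, intmin, dMin, and test_intmin are made up for this example, and the block size of 256 is an arbitrary choice.

```fortran
module intmin_sketch
  use cudafor
  implicit none
  ! Running minimum, kept in device global memory (hypothetical name)
  integer, device :: dMin
contains
  attributes(global) subroutine intmin(Ad, n)
    integer, value :: n
    integer, device :: Ad(n)
    integer :: i, old
    i = (blockidx%x - 1) * blockdim%x + threadidx%x
    ! atomicmin returns the previous value of dMin, which is not needed here
    if (i <= n) old = atomicmin(dMin, Ad(i))
  end subroutine intmin
end module intmin_sketch

program test_intmin
  use cudafor
  use intmin_sketch
  implicit none
  integer, parameter :: n = 1000
  integer :: A(n), hMin, k
  integer, device :: Ad(n)
  A = [(mod(37 * k, 101) + 5, k = 1, n)]   ! arbitrary test data
  Ad = A                                   ! host-to-device copy
  dMin = huge(0)                           ! start from the largest integer
  call intmin<<<(n + 255) / 256, 256>>>(Ad, n)
  hMin = dMin                              ! device-to-host copy
  ! The two printed values should agree
  print *, hMin, minval(A)
end program test_intmin
```

Because every thread updates dMin through atomicmin, the result does not depend on the order in which threads or blocks run.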
Hi Mat,
The reason is that I want the min to be computed on the GPU. However, I’m not sure whether the elemental min guarantees the true minimum value among the threads. Could you please confirm this?
Example: suppose minval was assigned the MAXIMUM value before calling the subroutine
attributes(global) subroutine foo(A, N, minval)
  real, dimension(N,N) :: A
  real :: minval
  idx = threadIdx%x;
  if (idx .le. N) then
    min1 = min(A(idx,:))
    minval = min(min1, minval)
  endif
end subroutine
Tuan
Hi Tuan,
For arrays, you want to use the reduction intrinsic “minval”. Something like:
% cat minreduc.cuf
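module minutil
  ! Device-resident results, written by the kernel and read back by the host
  real, device :: dMinval
  real, device :: dMaxval
contains
  attributes(global) subroutine foo (Ad,N)
    use cudafor
    implicit none
    integer, value :: N
    real, device, dimension(N,N) :: Ad
    integer i, j, tx, ty
    tx = threadidx%x
    ty = threadidx%y
    ! Global indices; the factor 16 must match the block size used at launch
    i = (blockidx%x-1)*16 + tx
    j = (blockidx%y-1)*16 + ty
    Ad(i,j) = (N*(i-1))+(j-1)
    ! Note: syncthreads() only synchronizes the threads within one block
    call syncthreads()
    ! A single thread reduces the whole array with the minval/maxval intrinsics
    if (i .eq. 1 .and. j .eq. 1 ) then
      dMinval = minval(Ad)
      dMaxval = maxval(Ad)
    endif
    call syncthreads()
  end subroutine foo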
  subroutine testmin ()
    use cudafor
    implicit none
    ! N is a named constant so that it can be used as an array bound
    integer, parameter :: N = 64
    real, dimension(N,N) :: A
    real, device, dimension(N,N) :: Ad
    ! Host copies of the results, named so they do not shadow the intrinsics
    real :: hMin, hMax
    type(dim3) :: dimGrid, dimBlock
    A=-1
    dMinval = -1
    dimGrid=dim3(N/16,N/16,1)
    dimBlock=dim3(16,16,1)
    Ad=0
    call foo<<<dimGrid,dimBlock>>>(Ad,N)
    ! Copy the array and the two scalar results back from the device
    A=Ad
    hMin = dMinval
    hMax = dMaxval
    print *, hMin, hMax
    print *, A(1,1), A(N,N)
  end subroutine testmin
end module minutil
program testme
  use minutil
  call testmin
end program testme
% pgfortran -o minreduc.out minreduc.cuf
% minreduc.out
0.000000 4095.000
0.000000 4095.000
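As a possible alternative to reducing inside a hand-written kernel, the same minimum can be sketched with a CUF kernel loop directive. This assumes a compiler that supports `!$cuf kernel do` and recognizes the `hMin = min(hMin, ...)` pattern as a scalar reduction; the program name cufmin and the variable hMin are hypothetical.

```fortran
program cufmin
  use cudafor
  implicit none
  integer, parameter :: n = 64
  real :: A(n,n), hMin
  real, device :: Ad(n,n)
  integer :: i, j

  ! Same test data as in the example above: values 0 .. n*n-1
  do j = 1, n
    do i = 1, n
      A(i,j) = n*(i-1) + (j-1)
    end do
  end do
  Ad = A                      ! host-to-device copy

  hMin = huge(hMin)           ! start from the largest representable real
  ! Assumption: the compiler turns this loop nest into a kernel and
  ! treats hMin as a min reduction across all iterations
  !$cuf kernel do(2) <<<*,*>>>
  do j = 1, n
    do i = 1, n
      hMin = min(hMin, Ad(i,j))
    end do
  end do

  ! The two printed values should agree
  print *, hMin, minval(A)
end program cufmin
```

Here the compiler chooses the launch configuration and combines the partial results, so the code does not rely on synchronization between thread blocks.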
Hi Mat,
Does minval support double precision as well? The documentation just says it supports real, so I’m not sure whether this includes double precision.
Please explain the difference between min and minval. Can min not be used on the GPU?
Thanks,
Tuan
Hi Tuan,
Does minval support double precision as well? The documentation just says it supports real, so I’m not sure whether this includes double precision.
Yes, minval supports double precision. When the docs say ‘real’, they mean both kinds.
Please explain the difference between min and minval.
min returns the smallest of two or more scalar values. minval finds the minimum value of the elements of an array.
Can min not be used on the GPU?
min can be used on the GPU. However, your code is trying to find the minimum value of an array, hence the use of minval.
- Mat
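The distinction between min and minval can be seen in a few lines of standard Fortran. This is a host-side sketch only, with made-up data and a hypothetical program name; it shows the semantics of the intrinsics and does not run anything on the GPU.

```fortran
program min_vs_minval
  implicit none
  real    :: a(4) = [3.0, -1.5, 8.0, 2.0]
  real    :: b(4) = [1.0,  4.0, 6.0, 5.0]
  real(8) :: d(3) = [2.5d0, -7.25d0, 0.0d0]
  real    :: m(2,3)

  ! Column-major fill: row 1 is (1, 2, 3), row 2 is (4, 5, 0)
  m = reshape([1.0, 4.0, 2.0, 5.0, 3.0, 0.0], [2, 3])

  ! min compares its arguments: the smallest of three scalars is 1.0
  print *, min(3.0, 1.0, 2.0)

  ! min is elemental: with array arguments it compares element by element,
  ! giving the array (1.0, -1.5, 6.0, 2.0), not a single value
  print *, min(a, b)

  ! minval reduces one array to a single value: -1.5
  print *, minval(a)

  ! minval also accepts double precision arrays: -7.25
  print *, minval(d)

  ! Minimum of one row of a 2-D array, as in A(idx,:): row 2 gives 0.0
  print *, minval(m(2,:))
end program min_vs_minval
```

So an expression like min(A(idx,:)) with a single array argument is not valid; the per-row minimum is minval(A(idx,:)).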
All clear. Thanks, Mat.
Tuan