Thanks Mat, we may be getting close! I made the change you suggested. Now there is a different problem, which is important to my work.
nvfortran -V23.11 -cuda tred.f90 -L/opt/nvidia/hpc_sdk//Linux_x86_64/23.11/math_libs/12.3/lib64/ -lcurand
malcolm31 !a
a.out
Max along any dimension 4.586379140905350
malcolm32 !v
vi tred.f90
malcolm33 a.out
Max along any dimension 4.586379140905350
malcolm34 a.out
Max along any dimension 4.586379140905350
malcolm35 a.out
Max along any dimension 4.586379140905350
malcolm36 a.out
Max along any dimension 1.3679363100023353E+241
malcolm37 a.out
Max along any dimension 1.3727972168158840E+241
malcolm38 a.out
Max along any dimension 4.586379140905350
As you can see, the output of random_number is not changing from run to run (except when it goes into the stratosphere!). I was under the impression that it should, with the seed reset by default through some system-dependent mechanism.
Malcolm.
Assuming this is the same code as above, “c” is uninitialized, so its initial value will be whatever happens to be in memory at the time. If that’s a big number, then that’s what you’d get.
Try setting c to something small, like 0, and see if it fixes the problem:
real(8) :: c
c = 0.
call random_number(a)
do idim = 1, 5
  b = sum(a, dim=idim)
  c = max(maxval(b), c)
end do
Note that you’re not using cuRAND here, just the normal host side random number generator. For an example on how to use cuRAND, please see the example in:
rm a.out
malcolm62 !nvf
nvfortran -V23.11 tred.f90 -L/opt/nvidia/hpc_sdk//Linux_x86_64/23.11/math_libs/12.3/lib64/ -lcurand
malcolm63 a.out
Max along any dimension 4.586379140905350
malcolm64 a.out
Max along any dimension 4.586379140905350
malcolm65 a.out
Max along any dimension 4.586379140905350
malcolm66 a.out
Max along any dimension 4.586379140905350
malcolm67 a.out
Max along any dimension 4.586379140905350
What’s particularly surprising is that it is giving the same reasonable answer as before!
I am specifically trying to keep things simple, hence the absence of any cuda.
Oh, sorry, I was just focused on the big numbers. The same number printing each time is expected since it uses a static value for the seed. You’ll want to add a call to random_seed. Without an argument, the initial seed is set as a function of time.
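One way to check that the seed really differs between runs is to query it right after the call. The following is only a minimal sketch: the program name seed_check and the variables n, seed and x are made up for illustration, and the time-based behaviour of the no-argument call is taken from the reply above, since the Fortran standard only says it is processor-dependent.

```fortran
program seed_check
  implicit none
  integer :: n
  integer, allocatable :: seed(:)
  real(8) :: x(3)

  ! With no arguments, random_seed re-initializes the generator in a
  ! processor-dependent way (assumed time-based here, per the reply above).
  call random_seed()

  ! Query the current seed so it can be compared between runs.
  call random_seed(size=n)
  allocate(seed(n))
  call random_seed(get=seed)
  print *, "seed   :", seed

  ! Draw a few numbers from the freshly seeded generator.
  call random_number(x)
  print *, "sample :", x
end program seed_check
```

If the seed line is identical across runs that are well separated in time, the generator is not being reseeded.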
Mat, back to the simple tred.f90 example. I modified it to include the random_seed statement and repeat. The code:
program multidimred
  use cudafor
  ! real(8), managed :: a(5,5,5,5,5)
  ! real(8), managed :: b(5,5,5,5)
  real(8) :: a(5,5,5,5,5)
  real(8) :: b(5,5,5,5)
  real(8) :: c
  c = 0.0
  do irep = 1,10
    call random_seed()
    call random_number(a)
    do idim = 1, 5
      b = sum(a, dim=idim)
      c = max(maxval(b), c)
    end do
    print *,"Max along any dimension",c
  end do
end program
I first ran it with the random_seed call before the do irep loop, and then with the call inside the loop (as shown above). Results, in that order:
a.out
Max along any dimension 4.352975488105642
Max along any dimension 4.558500935106593
Max along any dimension 4.558500935106593
Max along any dimension 4.558500935106593
Max along any dimension 4.558500935106593
Max along any dimension 4.558500935106593
Max along any dimension 4.558500935106593
Max along any dimension 4.583771852252923
Max along any dimension 4.583771852252923
Max along any dimension 4.583771852252923
malcolm34 vi tred.f90
malcolm35 !nvf
nvfortran -V23.11 tred.f90 -L/opt/nvidia/hpc_sdk//Linux_x86_64/23.11/math_libs/12.3/lib64/ -lcurand
malcolm36 !a
a.out
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
Max along any dimension 4.525450697040512
This is the opposite result from what I expected. Malcolm.
Yes, as before, because you’re setting NVHPC_CUDA_HOME, you can’t use the -cudalib flag. You need to edit the makefile to use “-L/opt/nvidia/hpc_sdk//Linux_x86_64/23.11/math_libs/12.3/lib64/ -lcurand”
For the updated program, you need to reset c to zero each time through the irep loop. Otherwise c holds the running maximum across all iterations so far, not the maximum for the current set of random numbers.
Also, only call random_seed once, at the beginning. As I said above, the initial seed when called without arguments is a function of time. Since the program is so quick, the unit of time hasn’t changed between calls, so each call resets the seed back to the same value.
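Putting both suggestions together, the loop could look something like the sketch below. It is an illustration rather than a tested fix: it drops "use cudafor" on the assumption that no CUDA features are needed in the host-only version, and it adds implicit none with explicit declarations for irep and idim, which the posted code leaves implicit.

```fortran
program multidimred
  implicit none
  real(8) :: a(5,5,5,5,5)
  real(8) :: b(5,5,5,5)
  real(8) :: c
  integer :: irep, idim

  ! Seed once, before the loop, so later iterations continue the same
  ! random sequence instead of restarting it.
  call random_seed()

  do irep = 1, 10
    ! Reset c so it holds the maximum for this iteration only.
    c = 0.0d0
    call random_number(a)
    do idim = 1, 5
      b = sum(a, dim=idim)
      c = max(maxval(b), c)
    end do
    print *, "Max along any dimension", c
  end do
end program multidimred
```

With the seed set once and c reset per iteration, each of the ten printed values should come from a different set of random numbers.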
Thanks Mat. I suspected what you said about the unit of time not changing fast enough relative to the speed of the program. I will proceed with your load suggestion tomorrow. I’m on the East Coast so it’s getting late! Malcolm.