The following code is a test to see how easy it will be to adapt some serial scientific computing code to CUDA, and I'm having trouble getting it to work. Could someone tell me whether I'm doing this completely wrong, or whether device functions simply can't be used this way? (The math is irrelevant; I just want something that eats up time so I can see the improvement from using CUDA.)
I also suspect I need to rework the line
a = a + testFunc2(a, i)
because I'm not sure how it would execute, or how the results from the individual threads would add up, on the GPU. In my non-CUDA version I simply call testFunc1 and run a do loop from 1 to 10000 inside it.
Also, on the line
b = testFunc1<<<100>>>(a)
the launch configuration between the chevrons is actually one hundred, followed by a comma, then another one hundred. For some reason the forum won't let me enter the line that way.
program test
use cudafor
double precision a, b,testFunc1
real t0, t1, tdiff
call cpu_time(t0)
a = 1.0000000d0
b = testFunc1<<<100>>>(a)
call cpu_time(t1)
tdiff = t1 - t0
write(6,*) a, tdiff
call exit
end
module cudaFuncs
contains
double precision attributes(device) function testFunc1(a)
double precision a, testFunc2
integer i
i = (blockIdx%x-1)*blockDim%x + threadIdx%x
a = a + testFunc2(a, i)
call syncthreads()
testFunc1 = a
return
end
double precision attributes(device) function testFunc2(a, i)
double precision a, testFunc3
integer i, j
a = a * i
a = a / 3.d0
do j = 1, 10000
a = testFunc3(a)
enddo
testFunc2 = a
return
end
double precision attributes(device) function testFunc3(a)
double precision a
integer i
do i = 1, 10000
a = a * a
a = sqrt(a)
a = a * 3.d0
a = a / 5.d0
enddo
testFunc3 = a
return
end
end module cudaFuncs
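
For comparison, here is a minimal sketch of one way the listing above might be restructured. It is not a definitive port. In CUDA Fortran the chevron launch syntax applies only to an attributes(global) subroutine invoked with call, and attributes(device) functions can only be called from device code, so the sketch wraps the work in a kernel. The names testKernel, partial, partial_d, nBlocks, nThreads and n are made up for illustration. The sketch also assumes that the intent of a = a + testFunc2(a, i) is to add every thread's result to a; to avoid all threads updating the same variable at once, each thread stores its result in its own array element and the host does the sum.

```fortran
module cudaFuncs
  use cudafor
  implicit none
contains

  ! Kernel: a global subroutine (not a function), launched from the host.
  ! Each thread writes only its own element of partial, so no two threads
  ! ever update the same memory location.
  attributes(global) subroutine testKernel(a0, partial, n)
    double precision, value :: a0
    integer, value :: n
    double precision :: partial(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) partial(i) = testFunc2(a0, i)
  end subroutine testKernel

  ! Device function: callable only from device code such as the kernel.
  attributes(device) function testFunc2(a, i) result(r)
    double precision, value :: a
    integer, value :: i
    double precision :: r
    integer :: j
    r = a * i
    r = r / 3.d0
    do j = 1, 10000
      r = testFunc3(r)
    enddo
  end function testFunc2

  attributes(device) function testFunc3(a) result(r)
    double precision, value :: a
    double precision :: r
    integer :: k
    r = a
    do k = 1, 10000
      r = r * r
      r = sqrt(r)
      r = r * 3.d0
      r = r / 5.d0
    enddo
  end function testFunc3

end module cudaFuncs

program test
  use cudafor
  use cudaFuncs
  implicit none
  ! Hypothetical launch configuration: 100 blocks of 100 threads.
  integer, parameter :: nBlocks = 100, nThreads = 100
  integer, parameter :: n = nBlocks * nThreads
  double precision, device :: partial_d(n)   ! per-thread results on the GPU
  double precision :: partial(n)             ! host copy of the results
  double precision :: a, b
  real :: t0, t1
  integer :: istat

  a = 1.0d0
  call cpu_time(t0)
  call testKernel<<<nBlocks, nThreads>>>(a, partial_d, n)
  istat = cudaDeviceSynchronize()  ! the launch is asynchronous; wait before timing
  partial = partial_d              ! device-to-host copy
  call cpu_time(t1)
  b = a + sum(partial)             ! accumulate the per-thread results on the host
  write(6,*) b, t1 - t0
end program test
```

Note that the module is placed before the program and the program has a use cudaFuncs statement, so the compiler can see the kernel's interface. With the loop counts kept from the original, each thread runs 10000 x 10000 iterations, which may be long enough to trip a display watchdog timer on a GPU that also drives a screen; smaller counts are probably safer for a first run.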