The following code is a test to see how easy it will be to adapt some serial scientific computing code to CUDA. I’m having trouble getting it going. Could someone tell me if I’m doing this completely wrong, or if I can’t use device functions this way? (The math is irrelevant; I’m just trying to get something running that eats up time so I can see the improvement from using CUDA.)
I’m also thinking I need to rework:
a = a + testFunc2(a, i)
as I’m not quite sure how that would execute and add up on the machine. In the non-CUDA version I simply call testFunc1, which contains a do loop from 1 to 10000.
And on the line:
b = testFunc1
the launch configuration actually contains a one hundred, followed by a comma, then another one hundred. For some reason the forum won’t allow me to enter a line like that.
program test
  use cudafor
  double precision a, b,testFunc1
  real t0, t1, tdiff
  call cpu_time(t0)
  a = 1.0000000d0
  b = testFunc1<<<100>>>(a)
  call cpu_time(t1)
  tdiff = t1 - t0
  write(6,*) a, tdiff
  call exit
end

module cudaFuncs
contains
  double precision attributes(device) function testFunc1(a)
    double precision a, testFunc2
    integer i
    i = (blockIdx%x-1)*blockDim%x + threadIdx%x
    a = a + testFunc2(a, i)
    call syncthreads()
    testFunc1 = a
    return
  end

  double precision attributes(device) function testFunc2(a, i)
    double precision a, testFunc3
    integer i, j
    a = a * i
    a = a / 3.d0
    do j = 1, 10000
      a = testFunc3(a)
    enddo
    testFunc2 = a
    return
  end

  double precision attributes(device) function testFunc3(a)
    double precision a
    integer i
    do i = 1, 10000
      a = a * a
      a = sqrt(a)
      a = a * 3.d0
      a = a / 5.d0
    enddo
    testFunc3 = a
    return
  end
end module cudaFuncs
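For comparison, here is a minimal sketch of one way the test might be restructured; it is not a definitive fix. It relies on the rule that only an attributes(global) subroutine can be launched with the chevron syntax, so a device function cannot be called as b = testFunc1<<<...>>>(a) from host code. The names testKernel, a_d, n and nIter are hypothetical and do not appear in the original. Three further assumptions: each thread gets its own array element, which avoids every thread updating the same scalar a; the device functions return results instead of modifying their arguments; and the loop counts are reduced so the run stays short. The module is placed before the program so that it is compiled before it is used.

```fortran
module cudaFuncs
  use cudafor
  implicit none
  ! Hypothetical loop count, reduced from the original 10000 to keep the run short
  integer, parameter :: nIter = 1000
contains

  ! Kernel: launched from the host with <<<blocks, threads>>>.
  ! It is a subroutine, so results come back through the device array a.
  attributes(global) subroutine testKernel(a, n)
    integer, value :: n
    double precision :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    ! Each thread updates only its own element, so no synchronization is needed
    if (i <= n) a(i) = a(i) + testFunc2(a(i), i)
  end subroutine testKernel

  ! Device function: callable only from kernels or other device routines
  attributes(device) function testFunc2(a, i) result(r)
    double precision, intent(in) :: a
    integer, intent(in) :: i
    double precision :: r
    integer :: j
    r = a * i / 3.d0
    do j = 1, nIter
      r = testFunc3(r)
    end do
  end function testFunc2

  attributes(device) function testFunc3(a) result(r)
    double precision, intent(in) :: a
    double precision :: r
    integer :: k
    r = a
    do k = 1, nIter
      r = r * r
      r = sqrt(r)
      r = r * 3.d0
      r = r / 5.d0
    end do
  end function testFunc3

end module cudaFuncs

program test
  use cudafor
  use cudaFuncs
  implicit none
  integer, parameter :: n = 100 * 100      ! one element per thread
  double precision :: a(n)                 ! host copy
  double precision, device :: a_d(n)       ! device copy
  integer :: istat
  real :: t0, t1

  a = 1.d0
  call cpu_time(t0)
  a_d = a                                  ! host-to-device copy
  call testKernel<<<100, 100>>>(a_d, n)    ! 100 blocks of 100 threads
  istat = cudaDeviceSynchronize()          ! wait for the kernel before timing
  a = a_d                                  ! device-to-host copy
  call cpu_time(t1)

  write(6,*) sum(a), t1 - t0
end program test
```

With the NVIDIA HPC SDK this would typically be built with something like nvfortran -cuda test.cuf (or pgfortran -Mcuda test.cuf with the older PGI compilers); the exact flags depend on the installed toolchain.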