While debugging my code, a question occurred to me about the fragment I was looking at. Namely, should I accelerate code (with the pragmas, or with CUDA at all) that involves exponentiation? Say I have a code fragment like:
do ik=2,6
  do k=0,np
    do i=1,m
      aa(i,k,ik) = aa(i,k,ik-1)**6
    enddo
  enddo
enddo
Would this be worth accelerating?
Or should I “unroll” the exponentiation so that it involves only multiplications:
do ik=2,6
  do k=0,np
    do i=1,m
      aa(i,k,ik) = aa(i,k,ik-1)*aa(i,k,ik-1)*aa(i,k,ik-1) &
                 * aa(i,k,ik-1)*aa(i,k,ik-1)*aa(i,k,ik-1)
    enddo
  enddo
enddo
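(For what it's worth, if the goal is fewer multiplies, a**6 can be done with three multiplications instead of five by cubing and then squaring. A sketch of what I mean, where t is a scalar temporary I've introduced for illustration:)

```fortran
do ik=2,6
  do k=0,np
    do i=1,m
      ! t = a**3, using two multiplies
      t = aa(i,k,ik-1) * aa(i,k,ik-1) * aa(i,k,ik-1)
      ! a**6 = (a**3)**2, one more multiply: three total
      aa(i,k,ik) = t * t
    enddo
  enddo
enddo
```

(I don't know whether the compiler already performs this kind of strength reduction for integer powers, which is part of my question.)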
It’s possible the compiler would do this automatically, but perhaps not. And perhaps the Accelerator pragmas prefer to see multiplies instead of powers (since multiply is a “simple” floating point instruction)? And, perhaps, this is the kind of thing that a GPU just shouldn’t do?!
As you can tell, I’m not a computer engineer but a scientist by trade, so I’m still getting used to this kind of “thinking” about my programming, rather than just transcribing equations and brute-forcing it.