Loop is parallelizable

rotteweiler · June 8, 2010, 8:16pm

Hi again

I am a student trying to learn more about GPUs, and I have a few questions about the following code:

!$acc region
do k = 1, n1
  do i = 1, n3
    y = 0
    do j = 1, n2
      y = y + a(i,j) * b(j,k)
    enddo
    c(i,k) = y
  enddo
enddo
!$acc end region

This code comes from the matrix multiplication sample provided by PGI. I have tried running it, but the innermost loop does not seem to be parallelized. If possible, could someone help me parallelize all of the loops? The compiler output I receive is:

37, Loop is parallelizable
38, Loop is parallelizable
Accelerator kernel generated
37, !$acc do parallel, vector(16)
38, !$acc do parallel, vector(16)
CC 1.0 : 12 registers; 24 shared, 64 constant, 0 local memory bytes; 66 occupancy
CC 1.3 : 12 registers; 24 shared, 64 constant, 0 local memory bytes; 100 occupancy
41, Loop is parallelizable
57, Loop interchange produces reordered loop nest: 57,59,58

If you are wondering why this code has been rewritten from the original:

!$acc region
do k = 1, n1
  do i = 1, n3
    c(i,k) = 0.0
    do j = 1, n2
      c(i,k) = c(i,k) + a(i,j) * b(j,k)
    enddo
  enddo
enddo
!$acc end region

The reason is that when I tried to compile the original code, I received the following message:

60, Complex loop carried dependence of ‘c’ prevents parallelization
Loop carried reuse of ‘c’ prevents parallelization
Inner sequential loop scheduled on accelerator

(On a side note, variables x and m were not accepted in the loops for some obscure reason.) Please let me know if anyone has come across those messages.

Thank you for your time!

-Chris

MatColgrove · June 9, 2010, 4:58pm

Hi Chris,

The inner loop is not parallelizable. Since you’re a student, I’ll let you ponder a bit as to why. Please let me know what you come up with. If you still don’t see it, I’ll give you a clue.

“(On a side note, variables x and m were not accepted in the loops for some obscure reason.) Please let me know if anyone has come across those messages.”

How are you using x and m? As loop index variables? Did you declare them as integer? If you didn’t declare them, they will be implicitly typed, and a name such as x defaults to real, which can’t be used as an index variable.
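
A minimal sketch of the implicit typing rule, assuming x and m were meant as loop indices (the original declarations are not shown in this thread, so the program below is purely illustrative). Under Fortran's default rules, an undeclared name beginning with i through n is integer and any other name is real. Adding implicit none turns a missing declaration into a compile-time error instead of a silent default:

```fortran
program implicit_typing_demo
  implicit none          ! every variable must now be declared explicitly
  integer :: x           ! undeclared, x would default to real
  integer :: m           ! undeclared, m would default to integer (i-n rule)
  integer :: total       ! hypothetical accumulator, for illustration only

  total = 0
  do x = 1, 3
    do m = 1, 2
      total = total + x * m
    enddo
  enddo
  print *, 'total =', total
end program implicit_typing_demo
```

Removing the declaration of x while keeping implicit none should make the compiler report the undeclared name directly, which is usually easier to diagnose than a rejected loop.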

Mat

rotteweiler · June 10, 2010, 5:34pm

Thank you for your help, Mat!

My guess is that each iteration of the inner loop depends on the result of the previous iteration, so the iterations must be done sequentially. I hope I hit the bull’s eye ;p
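
A minimal sketch of that guess, assuming a single row of a and a single column of b held in small hypothetical arrays (a_row, b_col, and the size 4 are made up for illustration). It isolates the j loop from the code above and prints y after every iteration to show that each update reads the value written by the one before it:

```fortran
program inner_loop_dependence
  implicit none
  integer, parameter :: n2 = 4     ! hypothetical size, for illustration only
  real :: a_row(n2), b_col(n2)     ! stand-ins for a(i,:) and b(:,k)
  real :: y
  integer :: j

  a_row = (/ 1.0, 2.0, 3.0, 4.0 /)
  b_col = (/ 1.0, 1.0, 1.0, 1.0 /)

  y = 0
  do j = 1, n2
    ! The right-hand side reads the y written by iteration j-1,
    ! so iteration j cannot start until iteration j-1 has finished.
    y = y + a_row(j) * b_col(j)
    print *, 'after j =', j, ' y =', y
  enddo
end program inner_loop_dependence
```

The k and i loops have no such chain, since each (i,k) pair writes its own element of c, which would be consistent with the compiler marking only those two loops as parallel.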

-Chris