difference between $acc region vs. $acc data region

Tuan · November 21, 2009, 1:09am

Can anyone help me to clarify the difference between using
$acc region

and

$acc data region

They look the same to me.

Another question is how to use the declarative data directives to copy a function’s argument to device memory so that it can be used in different accelerator regions.

Tuan

TheMatt · November 23, 2009, 1:51pm

Tuan,

I’m going to give your first question a shot, but it’s possible that I’m slightly off. I’m sure the PGI folks will weigh in soon at which point my post might disappear/be edited.

‘!$acc region’ denotes a compute region, whereas ‘!$acc data region’ denotes a data region. You are right in thinking there is some overlap, but the difference is that a data region only performs memory movement and data allocation.

That is, if you issue an ‘!$acc data region’ you are telling the compiler what the overall data movement and allocation for the region will be. If you use the ‘copy’ or ‘copyin’ clauses, that memory will be allocated on the device and the data copied over at the start of the region (‘copy’ also copies the data back at the end). ‘copyout’ tells the compiler that memory will be allocated, but the data movement will occur at the end, from the device back to the host.

BUT, using a data region directive does not mean that any GPU computation will occur within it. All the ‘!$acc data region’ does is say we are moving data. That’s it. To have actual computation occur on the GPU one needs to issue an ‘!$acc region’ or similar compute region.

I think the confusion is that one can have, and often does have, data movement and memory allocation by just using ‘!$acc region’. If you issue this directive without a data region, the compiler will (rightly) assume it needs to move data onto the device so it can do the computation and off of it to return results. And with 9.0, that was how you did it.
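As a minimal sketch of the case described above, here is a compute region with no enclosing data region. The program name, array names, and sizes are hypothetical and chosen only for illustration; the directive spelling follows the thread. A compiler without accelerator support treats the ‘!$acc’ lines as ordinary comments, so the program still runs on the host.

```fortran
program region_only_sketch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical problem size
  real :: a(n), b(n)               ! hypothetical arrays
  integer :: i

  a = 1.0

  ! Compute region only: no data clauses are given, so the compiler
  ! has to work out for itself that a must go to the device before
  ! the loop and that b must come back after it.
  !$acc region
  do i = 1, n
     b(i) = 2.0 * a(i)
  end do
  !$acc end region

  print *, b(1), b(n)
end program region_only_sketch
```

If this region sat inside a loop, that implied data movement would happen on every pass.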

The advantage to the data region, in my mind, is that one can have an over-arching data region with many compute regions within. This allows for much more efficient (and hopefully less) data movement, since without it, multiple compute regions could mean multiple data moves.
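The over-arching data region idea can be sketched as follows. This is an illustration under assumptions, not code from the thread: the program name and the arrays a, b, and c are hypothetical, and only the ‘copyin’ and ‘copyout’ clauses mentioned earlier are used. Without accelerator support the ‘!$acc’ lines are plain comments and the loops run on the host.

```fortran
program data_region_sketch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical problem size
  real :: a(n), b(n), c(n)         ! hypothetical arrays
  integer :: i

  a = 1.0

  ! One data region: a is copied to the device once at entry;
  ! b and c are allocated on the device and copied back once at exit.
  !$acc data region copyin(a) copyout(b, c)

  ! First compute region uses the copy of a already on the device.
  !$acc region
  do i = 1, n
     b(i) = 2.0 * a(i)
  end do
  !$acc end region

  ! Second compute region reuses the same device copy of a.
  !$acc region
  do i = 1, n
     c(i) = a(i) + 1.0
  end do
  !$acc end region

  !$acc end data region

  print *, b(1), c(n)
end program data_region_sketch
```

With the data region removed, each of the two compute regions would have to arrange its own transfer of a.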

I’m a bit confused as to what you are asking in your second question, though.

MatColgrove · November 23, 2009, 5:45pm

Hi Tuan,

Matt does a good job of explaining the difference between an Accelerator region and a data region.

As for your second question, I think the “reflected” and “mirrored” data region clauses are what you’re looking for. However, our engineers weren’t able to add these quite yet, so data regions are currently only supported within the same function as an accelerator region. We should be able to add this some time in Spring 2010, and then data regions will be able to span function calls (Fortran only).

Mat

TheMatt · November 23, 2009, 6:15pm

Ah. Now I get what those will be for…and very useful they will be!

Tuan · November 24, 2009, 2:40am

Thanks to both Mats for the clear explanation.

Consider this code snippet, where A, B, and C are used extensively and are read-only.

!$acc data region copyin(A, B, C)
  do i = 1, 100
    !$acc region
    do j = 1, M
      do k = 1, N
        X(j, k) = A(j, k) * B(j, k) + C(j, k)
      enddo
    enddo
    !$acc end region

    ..... ! other code (does not modify A, B, or C)
  enddo
!$acc end data region

I expect this code to run much faster than this one:

  do i = 1, 100
    !$acc region
    do j = 1, M
      do k = 1, N
        X(j, k) = A(j, k) * B(j, k) + C(j, k)
      enddo
    enddo
    !$acc end region

    ..... ! other code (does not modify A, B, or C)
  enddo

What is your opinion?

MatColgrove · November 24, 2009, 4:43pm

Yes, using the data region here will be much faster since A, B, and C will only be copied to the GPU once. Without the data region they would be copied 100 times.

Mat