Same worksharing type in nested loops - parallel construct

sWienke · February 21, 2013, 9:02am

Hi,
I can specify a “gang vector” loop schedule for both loops of a loop nest when using the kernels construct:

#pragma acc kernels
#pragma acc loop gang vector
        for( int j = 0; j < n; j++)
        {
#pragma acc loop gang vector
            for( int i = 0; i < m; i++ ) {...}
        }

Then the compiler uses a two-dimensional grid and two-dimensional blocks, which is exactly what I want:

         67, #pragma acc loop gang, vector(2) /* blockIdx.y threadIdx.y */
         70, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */

However, if I use the parallel construct instead of kernels, I get an error message and the inner loop schedule is ignored:

PGC-S-0155-Nested loops cannot have the same worksharing type  (file.c: 67)
[..]
67, #pragma acc loop gang, vector(256) /* blockIdx.x threadIdx.x */

Why do I get this error when it apparently worked nicely (and as expected) with the kernels construct?
How can I get two-dimensional grids and two-dimensional blocks with the parallel construct?
Bye, Sandra

sWienke · February 28, 2013, 11:14am

Any news?

mwolfe · February 28, 2013, 10:57pm

Sandra: This is defined behavior for the parallel construct. It’s more like the OpenMP loop construct (omp for or omp do): a given worksharing type applies to only one loop in a nest. The kernels construct essentially allows tiling. For the parallel construct, we’re adding an explicit tile clause for nested loops in the next OpenACC version, which should give you the behavior you want.
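
For readers landing here later, the two options under the parallel construct might look roughly like the sketch below. It is only an illustration, not code from this thread: the function names, the flattened array `a`, the loop body, and the tile size of 16 x 16 are all assumptions. The first variant gives each loop a different level of parallelism, which is what the parallel construct accepts. The second uses the tile clause from OpenACC 2.0 and later. Whether the compiler turns the tiles into a two-dimensional grid with two-dimensional blocks depends on the compiler and target, so check the -Minfo=accel output.

```c
/* Variant 1 (hypothetical example): under "parallel", assign a different
 * level of parallelism to each loop of the nest: gang on the outer loop,
 * vector on the inner loop. */
void add_one_gang_vector(float *a, int n, int m)
{
#pragma acc parallel loop gang copy(a[0:n*m])
    for (int j = 0; j < n; j++) {
#pragma acc loop vector
        for (int i = 0; i < m; i++) {
            a[j * m + i] += 1.0f;
        }
    }
}

/* Variant 2 (hypothetical example): the tile clause asks the compiler to
 * split the two tightly nested loops into tiles. The 16 x 16 tile size is
 * an arbitrary choice for illustration. */
void add_one_tiled(float *a, int n, int m)
{
#pragma acc parallel loop tile(16, 16) copy(a[0:n*m])
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            a[j * m + i] += 1.0f;
        }
    }
}
```

Both functions add 1 to every element of an n x m array stored row by row. A compiler without OpenACC support ignores the pragmas and runs the loops sequentially, so the results can be compared against an accelerated build.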
