Character limit on directives?

There appears to be a limit of about 200 characters on the length of compiler directives. Going over this limit produces the error messages:

Syntax error: Recovery attempted by deleting
Syntax error: Recovery attempted by inserting before ‘{’

Is there a way to get around this? I have a lot of variables to specify in a data region directive. Splitting over multiple lines using a backslash doesn’t seem to have any effect.

Thanks

Here’s what I want:

#pragma acc data region local(z[flox:fhix][floy:fhiy], d[flox:fhix][floy:fhiy], q[flox:fhix][floy:fhiy], r[flox:fhix][floy:fhiy]), copy(x[flox:fhix][floy:fhiy]), copyin(b[flox:fhix][floy:fhiy], aW[flox:fhix][floy:fhiy], aS[flox:fhix][floy:fhiy], pC[flox:fhix][floy:fhiy], pS[flox:fhix][floy:fhiy], pW[flox:fhix][floy:fhiy])

Hi BeachHut,

I think something else is wrong, since you should be able to go well beyond 200 characters. Can you post a small example or send one to PGI Customer Service (trs@pgroup.com)?

Thanks,
Mat

Hi Mat,

Looking at it again I see that it would actually be significantly longer than 200 characters since flox, fhix, etc. are replaced by quite long expressions by the pre-processor. I’d still like to know if there is a way around this without altering my code so that the replaced expressions are shorter (which I admit wouldn’t be difficult to do).
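To make the growth visible, here is a small sketch that uses preprocessor stringification to compare the length of one array section as written with its length after macro expansion. The helper macros STR and XSTR and the two function names are made up for this illustration; the other definitions are copied from the example in this thread. The exact expanded length depends on how the preprocessor spaces the tokens, so the sketch only compares the two lengths.

```c
#include <stddef.h>
#include <string.h>

/* Two-step stringification: STR stringifies its argument as written,
   XSTR lets the argument's macros expand first. (Hypothetical helpers.) */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Definitions from the thread's example (the -DLONG variant). */
#define ARRAYSTART 0
#define sx 128
#define sy 128
#define OLx 1
#define OLy 1
#define lox (ARRAYSTART + OLx)
#define hix (lox + sx - 1)
#define loy (ARRAYSTART + OLy)
#define hiy (loy + sy - 1)
#define flox (lox - OLx)
#define fhix (hix + OLx)
#define floy (loy - OLy)
#define fhiy (hiy + OLy)

/* Length of one array section exactly as typed in the directive. */
size_t section_len_as_written(void)
{
  return strlen(STR(z[flox:fhix][floy:fhiy]));
}

/* Length of the same section after the preprocessor has expanded
   flox, fhix, floy and fhiy. */
size_t section_len_expanded(void)
{
  return strlen(XSTR(z[flox:fhix][floy:fhiy]));
}
```

Calling both functions from a main() and printing the results shows how much longer each of the eleven sections becomes once the nested macros are substituted.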

Here is a minimal example that demonstrates the problem.
Compile with -DLONG to generate the error.

/* Minimal example of compiler directive length problem */

#include <stdio.h>	/* needed for printf */

#define ARRAYSTART 0

#define sx 128
#define sy 128
#define OLx 1
#define OLy 1

#define lox (ARRAYSTART + OLx)
#define hix (lox + sx - 1)
#define loy (ARRAYSTART + OLy)
#define hiy (loy + sy - 1)


#ifdef LONG

#define flox (lox - OLx)
#define fhix (hix + OLx)
#define floy (loy - OLy)
#define fhiy (hiy + OLy)

#else

#define flox 0
#define fhix 100
#define floy 0
#define fhiy 100

#endif


int
main (void)
{

  int i, j;
  double z[fhix][fhiy];
  double d[fhix][fhiy];
  double q[fhix][fhiy];
  double r[fhix][fhiy];
  double x[fhix][fhiy];
  double b[fhix][fhiy];
  double aW[fhix][fhiy];
  double aS[fhix][fhiy];
  double pC[fhix][fhiy];
  double pS[fhix][fhiy];
  double pW[fhix][fhiy];

  for (i = flox; i < fhix; i++)
    {
      for (j = floy; j < fhiy; j++)
	{
	  x[i][j] = 1.0;
	  b[i][j] = 1.0;
	  aW[i][j] = 1.0;
	  aS[i][j] = 1.0;
	  pC[i][j] = 1.0;
	  pS[i][j] = 1.0;
	  pW[i][j] = 1.0;
	}
    }

#pragma acc data region local(z[flox:fhix][floy:fhiy], d[flox:fhix][floy:fhiy], q[flox:fhix][floy:fhiy], r[flox:fhix][floy:fhiy]), copy(x[flox:fhix][floy:fhiy]), copyin(b[flox:fhix][floy:fhiy], aW[flox:fhix][floy:fhiy], aS[flox:fhix][floy:fhiy], pC[flox:fhix][floy:fhiy], pS[flox:fhix][floy:fhiy], pW[flox:fhix][floy:fhiy])
  {

#pragma acc region
    {

      for (i = flox; i < fhix; i++)
	{
	  for (j = floy; j < fhiy; j++)
	    {

	      z[i][j] = 1.0;
	      d[i][j] = 1.0;
	      q[i][j] = 1.0;
	      r[i][j] = 1.0;

	      x[i][j] = z[i][j] + d[i][j] + q[i][j] + r[i][j]
		+ x[i][j]
		+ b[i][j] + aW[i][j] + aS[i][j]
		+ pC[i][j] + pS[i][j] + pW[i][j];
	    }
	}

    }

  }


  for (i = flox; i < fhix; i++)
    {
      for (j = floy; j < fhiy; j++)
	{
	  printf ("%f ", x[i][j]);
	}
      printf ("\n");
    }

}

Ha, I implemented a quick fix for the above problem (just created new const ints that were equal to the macro values flox, etc.) but now I get another error:

Source file too large to compile at this optimization level

It’s just a plain Conjugate Gradient code and the file in question only has 286 lines… Surely this shouldn’t be pushing the boundaries?
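For readers who want to try the quick fix mentioned above, a minimal sketch might look like the following. The constant names (c_flox and so on) and the function name are assumptions made for illustration, since the actual code was not posted; only two of the arrays are kept to keep it short. The directive mirrors the lower:upper section form used in this thread. A compiler without PGI Accelerator support would normally just ignore the pragmas and run the loops on the host.

```c
/* Definitions from the thread's example (the -DLONG variant). */
#define ARRAYSTART 0
#define sx 128
#define sy 128
#define OLx 1
#define OLy 1
#define lox (ARRAYSTART + OLx)
#define hix (lox + sx - 1)
#define loy (ARRAYSTART + OLy)
#define hiy (loy + sy - 1)
#define flox (lox - OLx)
#define fhix (hix + OLx)
#define floy (loy - OLy)
#define fhiy (hiy + OLy)

static double x[fhix][fhiy];
static double b[fhix][fhiy];

/* Runs one accelerated pass and returns x[0][0]. */
double run_with_const_bounds(void)
{
  /* Hypothetical names: constants holding the macro values, so the
     directive no longer grows when the preprocessor expands it. */
  const int c_flox = flox, c_fhix = fhix;
  const int c_floy = floy, c_fhiy = fhiy;
  int i, j;

  for (i = c_flox; i < c_fhix; i++)
    for (j = c_floy; j < c_fhiy; j++)
      {
        x[i][j] = 1.0;
        b[i][j] = 1.0;
      }

  /* Section bounds written as in the thread's directive (assumption). */
#pragma acc data region copy(x[c_flox:c_fhix][c_floy:c_fhiy]), copyin(b[c_flox:c_fhix][c_floy:c_fhiy])
  {
#pragma acc region
    {
      for (i = c_flox; i < c_fhix; i++)
        for (j = c_floy; j < c_fhiy; j++)
          x[i][j] = x[i][j] + b[i][j];
    }
  }

  return x[0][0];
}
```

The design point is simply that the identifiers in the directive are ordinary variables, so the preprocessor leaves them alone and the directive keeps the length it has in the source.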

Hi BeachHut,

Thank you for the example code. I was able to recreate the problem and have submitted a technical problem report (TPR#16769) to our engineers. Note that this is a general limit of the preprocessor rather than something specific to the PGI Accelerator model. Hopefully our engineers can adjust the preprocessor to allow for longer lines.

As a workaround, I would suggest removing the “copy” and “copyin” clauses, since the compiler is able to auto-detect these.

#pragma acc data region local(z[flox:fhix][floy:fhiy], d[flox:fhix][floy:fhiy], q[flox:fhix][floy:fhiy], r[flox:fhix][floy:fhiy])
  {



% pgcc long.c -ta=nvidia -DLONG -fast -Minfo=accel -o long.out
main:
     66, Generating local(r[:129][:129])
         Generating local(q[:129][:129])
         Generating local(d[:129][:129])
         Generating local(z[:129][:129])
     69, Generating copy(x[0:128][0:128])
         Generating copyin(b[0:128][0:128])
         Generating copyin(aW[0:128][0:128])
         Generating copyin(aS[0:128][0:128])
         Generating copyin(pC[0:128][0:128])
         Generating copyin(pS[0:128][0:128])
         Generating copyin(pW[0:128][0:128])
         Generating compute capability 1.3 kernel
     71, Loop is parallelizable
         Accelerator kernel generated
         71, #pragma acc for parallel, vector(128)
     73, Loop is parallelizable

As for the “source file too large” error, this means the compiler ran out of memory. Do you get the error with the above code? What flags are you using? I’ll probably need another example since I’m not able to recreate it with the above example.

- Mat

There is a loop over kernels within the #pragma acc data region, so without the copy and copyin clauses the compiler needlessly copies unchanged data from the CPU on every iteration.
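A rough sketch of the pattern described here, under stated assumptions: N, the 1-D arrays and the function name are invented for illustration, and the update is a stand-in for the real kernels. The intent is that the explicit clauses on the data region move x and b once, while the compute region inside the iteration loop reuses the data already on the device. On a compiler without PGI Accelerator support the pragmas would normally be ignored and the loops run on the host.

```c
#define N 8	/* hypothetical small problem size */

static double x[N];
static double b[N];

/* Repeats a simple update n_iters times inside one data region and
   returns x[0] afterwards. */
double iterate_in_data_region(int n_iters)
{
  int it, i;

  for (i = 0; i < N; i++)
    {
      x[i] = 1.0;
      b[i] = 1.0;
    }

  /* Sections use the lower:upper form seen in the compiler output
     quoted in this thread (assumption). */
#pragma acc data region copy(x[0:N-1]), copyin(b[0:N-1])
  {
    for (it = 0; it < n_iters; it++)
      {
        /* One kernel launch per iteration; b is meant to stay on the
           device between launches. */
#pragma acc region
        {
          for (i = 0; i < N; i++)
            x[i] = x[i] + b[i];
        }
      }
  }

  return x[0];
}
```

Compiling a version like this with -Minfo=accel should show whether the copies are reported at the data region or at the inner region.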

I do not receive the “source file too large” error with this minimal example. I can tar the directory with the problem code and e-mail it to you, if that is possible. I am compiling the code with pgcc *.c -ta=nvidia,cc13 -Minfo.

Please send the code to PGI Customer Support (trs@pgroup.com) and ask them to forward it to me.

Thanks,
Mat

Thanks Mat, message sent.

In case it is of interest to others, the “source file too large” error was caused by accidentally having a repeated ‘local’ section in the #pragma acc data region directive.

Hi BeachHut,

FYI, TPR#16769 will be fixed in the 10.5 release.

- Mat