I get the following error when compiling some OpenACC code using v12.5.
PGC-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): Kernel argument list is > 256 bytes, the max supported by CUDA (le_core.c: 286)
The code originally used function calls, but the compiler didn’t like this, so I inlined them all by hand (copy + paste). I suspect I now simply have too many variables and am overflowing the argument list.
So, a few questions for the panel:
Is having too many variables the likely cause of this error?
Are there any general strategies for avoiding this? What about stuffing scalars into a struct/array for example?
Are we likely to see support for the compiler inlining function calls?
I’d post the code here but it’s a bit of a monster so I can email it in if necessary.
This is a known problem (TPR#18752) that first appeared in 12.5. Normally we have a work-around for NVIDIA’s 256 byte argument limit: the arguments are wrapped up in a struct, the struct is copied to the device, and only a pointer to the struct is passed as an argument. For some reason, in 12.5 this work-around didn’t kick in for all cases.
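To make the size argument concrete, here is a minimal host-only sketch in plain C. It is not the compiler's actual implementation; the names kernel_args, bytes_by_value, bytes_by_pointer and sum_args are hypothetical, and the 40-double payload is just an assumed example that exceeds 256 bytes. It only shows why passing a pointer to a wrapped struct stays under the limit where passing the contents by value would not.

```c
#include <stddef.h>

/* Limit quoted in the compiler's error message. */
enum { CUDA_ARG_LIMIT = 256 };

/* Hypothetical bundle of kernel arguments: 40 doubles = 320 bytes,
 * which is more than the 256-byte argument list allows. */
typedef struct {
    double coeff[40];
} kernel_args;

/* Argument space needed if the bundle is passed by value. */
size_t bytes_by_value(void)
{
    return sizeof(kernel_args);
}

/* Argument space needed if only a pointer to the bundle is passed. */
size_t bytes_by_pointer(void)
{
    return sizeof(kernel_args *);
}

/* Stand-in for a kernel body: it reads every argument through the
 * pointer instead of receiving each one as a separate parameter. */
double sum_args(const kernel_args *args)
{
    double total = 0.0;
    for (int i = 0; i < 40; ++i) {
        total += args->coeff[i];
    }
    return total;
}
```

On a real device the struct would also have to be copied to device memory before the kernel launch; that step is left out here because this sketch runs on the host only.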
From the notes in TPR#18752, it doesn’t look like we’ll have a fix in place for the next release, but it should be in the one after.
I’d post the code here but it’s a bit of a monster so I can email it in if necessary.
Unless it’s a real pain to send, having a second example of the problem doesn’t hurt. If anything, we can then make sure that we’ve fixed it in your particular case and notify you directly once the fix is in place. I found the original issue myself, so yours is the first known external report.
Are there any general strategies for avoiding this? What about stuffing scalars into a struct/array for example?
An array might work, but since this is a temporary issue, I wouldn’t change your program too much, especially if 12.4 works for you.
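If you do want to try the array route, a rough sketch might look like the following. Everything in it is an assumption for illustration: the parameter names (P_DT, P_DX, P_RHO), the function scale_field, and the arithmetic are all made up, and I can't promise it sidesteps the problem in 12.5. The idea is only that several scalars travel as one array argument instead of one argument each.

```c
/* Hypothetical indices for scalars that used to be separate variables. */
enum { P_DT, P_DX, P_RHO, P_COUNT };

/* Scale every element of field using scalars packed into params.
 * The region then refers to one small array rather than many scalars. */
void scale_field(double *restrict field, int n, const double *restrict params)
{
    #pragma acc kernels loop copy(field[0:n]) copyin(params[0:P_COUNT])
    for (int i = 0; i < n; ++i) {
        field[i] = field[i] * params[P_RHO] * params[P_DT] / params[P_DX];
    }
}
```

A caller would fill a double array of length P_COUNT once, for example params[P_DT] = 0.5, and pass it to scale_field along with the field. Built without OpenACC enabled, the pragma is ignored and the loop runs on the host, which makes it easy to check results before moving to the accelerator.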
Are we likely to see support for the compiler inlining function calls?
The compiler does support inlining function calls. There are cases where it can’t, but I’d need to know more specifics about your case to tell why the function is not inlining and if it’s even possible.