ICE with OpenMP map clause

Hi,

I’m encountering an ICE with the latest HPC SDK for a simple Saxpy example:

#include <stdlib.h>

int main (int argc, char *argv[]) {
  size_t N = 1024*1024;
  double k = 1.2345;

  double *a = (double*)malloc(N*sizeof(double));
  double *b = (double*)malloc(N*sizeof(double));
  double *c = (double*)malloc(N*sizeof(double));

  for (size_t i = 0; i < N; i++) {
     a[i] = 0.5 * (double)i;
     b[i] = 0.75 * (double)i;
  }

#pragma omp target map (to: a[0:N], b[0:N])
#pragma omp target teams distribute parallel for
  for (size_t i = 0; i < N; i++) {
     c[i] = k * a[i] + b[i];
  }
  free(a);
  free(b);
  free(c);
  return 0;
}

(I am not entirely sure whether that combination of pragmas is valid. Maybe it isn’t, and that’s why this hasn’t been found until now?)

Compiling with -mp=gpu yields

NVC++-F-0000-Internal compiler error. child tinfo should have been created at outlining function for host    1327  (test2.c: 16)
NVC++/x86-64 Linux 24.5-1: compilation aborted

The program compiles fine when I change the outer directive to “target enter data”.
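That is, roughly the following variant (a sketch; note that c is still left unmapped here):

#pragma omp target enter data map(to: a[0:N], b[0:N])
#pragma omp target teams distribute parallel for
  for (size_t i = 0; i < N; i++) {
     c[i] = k * a[i] + b[i];
  }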

The compiler version that I use is 24.5-1, but I have observed the ICE with 24.3 and 23.5, too.

Regards,
Christian

Hi Christian,

The problem here is with nested target compute regions. I assume you meant to have the outer target be a data region, i.e. “#pragma omp target data map (to: a[0:N], b[0:N])”.
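That is, something along these lines (a sketch; I’ve also mapped c back, which the original code would need in order to retrieve the results):

#pragma omp target data map(to: a[0:N], b[0:N]) map(from: c[0:N])
{
#pragma omp target teams distribute parallel for
   for (size_t i = 0; i < N; i++) {
      c[i] = k * a[i] + b[i];
   }
}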

I can’t find anything in the standard that states whether compute regions can be nested, which would make the behavior undefined. Granted, I could be missing it, but even if it were allowed, this code is asking to launch a compute kernel from within another compute kernel (aka dynamic parallelism).

OpenACC specifically allows dynamic parallelism, but we chose not to support it given that, other than a few toy examples, we’ve not been able to find any real use cases for it.

Was it your intent to use dynamic parallelism?

Of course, the compiler shouldn’t ICE, so I’ve filed a problem report, TPR#35881, and asked engineering to detect this case and emit a proper error.

Besides changing the outer directive to “target data”, you can instead remove the inner “target” to get it to work:

#pragma omp target map (to: a[0:N], b[0:N])
#pragma omp teams distribute parallel for
  for (size_t i = 0; i < N; i++) {
     c[i] = k * a[i] + b[i];
  }
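Note that since “c” isn’t in the map clause, it’s treated as a zero-length array section inside the target region. On systems with separate host and device memory you’ll likely also want to map it to get the results back, e.g.:

#pragma omp target map (to: a[0:N], b[0:N]) map(from: c[0:N])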

-Mat

Hi Mat,

thank you for the detailed answer and for reporting the issue. There’s no specific intent behind this code; I was just playing around with different combinations of pragmas when I encountered the ICE.

I agree with your understanding. The map clause on its own is tied to a specific target region, whereas target data and target enter/exit data are data-management directives in their own right, which is why the code works with them.
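For completeness, my understanding of the full unstructured pattern would be something like this (untested sketch):

#pragma omp target enter data map(to: a[0:N], b[0:N]) map(alloc: c[0:N])
#pragma omp target teams distribute parallel for
  for (size_t i = 0; i < N; i++) {
     c[i] = k * a[i] + b[i];
  }
#pragma omp target exit data map(from: c[0:N]) map(release: a[0:N], b[0:N])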

Regards,
Christian