Invalid result of the reduction with teams distribute in a parallel region

mickaelg · February 26, 2025, 9:36am

Hi,

The nvc++ compiler produces an invalid result when a reduction is performed using “target teams distribute parallel for” in a parallel region.

The problem occurs even when the environment variable “OMP_NUM_THREADS” is set to 1.

A simple reproducer:

#include <stdio.h>

int main(void) {
    #pragma omp parallel
    {
        int sum = 0;
        // #pragma omp parallel for reduction(+:sum)        // => OK on cpu
        // #pragma omp target parallel for reduction(+:sum) // => OK
        #pragma omp target teams distribute parallel for reduction(+:sum) // => WRONG
        for (int i = 0; i < 20000; i++) {
            sum += i;
        }
        printf("sum2 = %d\n", sum);
    }
    return 0;
}

OMP_NUM_THREADS=1 ./essai
gives

sum2 = 19232

instead of

sum2 = 199990000
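The expected value follows from the closed form for the sum of 0 .. n-1, which is n(n-1)/2. For n = 20000 this is 199990000, well below INT_MAX on a 32-bit int, so integer overflow does not explain the wrong value. Here is a minimal host-only sketch of that check; `expected_sum` and `serial_sum` are hypothetical helper names, not part of the reproducer:

```c
#include <stdio.h>

/* Hypothetical helper: closed-form sum of 0 .. n-1. */
long expected_sum(long n) {
    return n * (n - 1) / 2;
}

/* Hypothetical helper: the same sum as a plain serial loop, no OpenMP. */
long serial_sum(long n) {
    long sum = 0;
    for (long i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
```

Both helpers return the same value for n = 20000, which is the reference result that the offloaded reduction should reproduce.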

If I comment out the outer “#pragma omp parallel” line, the result is correct (sum2 = 199990000).

The result is also correct when the reduction uses “target parallel for reduction” instead.
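As a sketch of the variant reported as correct above, the same loop can be wrapped in a function that keeps the outer parallel region but drops “teams distribute”. The function name `sum_in_parallel_region` and the `single` copy-out are assumptions made for illustration only, and this has not been checked against compiler versions other than the one reported in this post:

```c
#include <stdio.h>

/* Hypothetical wrapper: same loop as the reproducer, inside a host
 * parallel region, using "target parallel for" instead of
 * "target teams distribute parallel for". */
int sum_in_parallel_region(int n) {
    int result = 0;
    #pragma omp parallel
    {
        int sum = 0;  /* private to each host thread, as in the reproducer */
        #pragma omp target parallel for reduction(+:sum)
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        /* Every host thread computes the same sum; one of them stores it. */
        #pragma omp single
        result = sum;
    }
    return result;
}
```

Calling `sum_in_parallel_region(20000)` should return 199990000 if the reduction is handled correctly.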

I use the command: “nvc++ -mp=gpu -O3 essai.c -o essai”

“nvc++ --version” returns
"nvc++ 25.1-0 64-bit target on x86-64 Linux -tp cascadelake "

Mickaêl

MatColgrove · February 26, 2025, 5:18pm

Hi Mickaêl,

Sorry, I should have mentioned this when I suggested it as a workaround for the ICE with the loop version.

There’s a core issue with using target offload inside an outer host parallel region, having to do with private variables. This is a specific example of that larger issue. I’ve already noted the wrong answers in the earlier report.

-Mat
