-ta=multicore failed to compile with llvm code generation

oukore · July 23, 2019, 12:10am

Also looking for workarounds that still use llvm code generation.

/**
$ pgc++ -v
Export PGI_CURR_CUDA_HOME=/opt/pgi/linux86-64-llvm/2019/cuda/
Export PGI=/opt/pgi
pgc++-Warning-No files to process


$ pgc++ -c bug.cc -ta=multicore
/opt/pgi/linux86-64-llvm/19.4/share/llvm/bin/opt: /tmp/pgc++lElHCF-OeKc.ll:106:23: error: use of undefined value '%n2.addr'
        %22 = load i32, i32* %n2.addr, align 4, !tbaa !33, !dbg !44
                             ^
 */

void func(float* b, unsigned n) {
  unsigned n2 = n;
  #pragma acc parallel loop independent private(b[0:n2])
  for (int i = 0; i < n; ++i) {}
}

MatColgrove · July 23, 2019, 5:09pm

Hi stw,

Thanks for the report. I added TPR#27435 to track this issue.

The problem here is that since “n2” isn’t used anywhere except as the array-section bound in a private clause, the reference is getting deleted. I’ve seen similar issues in the past, but it looks like we missed this case. The same code works when targeting Tesla or when used in a copy clause.

The workaround would be to use “n” in place of “n2”, or to reference “n2” somewhere in the body of the loop. Something like:

void func(float* b, unsigned n) {
  unsigned n2 = n;
  #pragma acc parallel loop independent private(b[0:n2])
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n2; ++j) {
    }
  }
}

-Mat

oukore · July 23, 2019, 8:35pm

Thank you Mat.

The first post was just a minimal working example. The original code looked like this:

unsigned nsq = n * n;
// ... malloc() using nsq
#pragma acc parallel loop independent private(b[0:nsq])
for (int i = 0; i < n; ++i) {
// ...
}

Thanks to your reply, I think the problem can be avoided simply by writing the expression directly in the clause:

#pragma acc parallel loop independent private(b[0:n*n])
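
For anyone landing here later, a fuller sketch of that workaround might look like the following. Only the private(b[0:n*n]) clause comes from this thread; the function name scale_rows, the out parameter, the copyout clause, and the loop body are hypothetical and exist only to make the snippet self-contained. It has not been checked against PGI 19.4, so treat it as an illustration of the idea rather than a confirmed fix.

```cpp
#include <cstdlib>

// Hypothetical example: scale_rows, out, and the loop body are assumptions.
// Only the private(b[0:n*n]) clause is taken from the thread.
bool scale_rows(float* out, unsigned n) {
  // Scratch buffer of n*n floats, mirroring the malloc() in the original code.
  float* b = static_cast<float*>(std::malloc(sizeof(float) * n * n));
  if (b == nullptr) return false;

  // The section length is written as n*n in the clause itself, so no local
  // variable (such as nsq) is referenced only from the private clause.
  #pragma acc parallel loop independent private(b[0:n*n]) copyout(out[0:n])
  for (unsigned i = 0; i < n; ++i) {
    b[0] = static_cast<float>(i);  // b is private, so this slot is not shared
    out[i] = 2.0f * b[0];
  }

  std::free(b);
  return true;
}
```

Built without OpenACC support, the pragma is ignored and the loop runs serially with the same result, which makes it easy to compare against a -ta=multicore build.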
