PGI Fortran CUDA Internal compiler error. lili redefinition

I am getting a strange compiler error:

error F0000 : Internal compiler error. lili redefinition.      62

when compiling a very simple piece of code:

attributes(global) subroutine foo(a, n)
    implicit none
    
    integer,value :: n
    complex(8),device :: a(n)
    
    integer :: ix
    
    ix = (blockIdx%x-1)*blockDim%x + threadIdx%x
    
    if (ix <= n) then
        a(ix) = 0.0_8
    endif
    
end subroutine

I am targeting Generic x86-64 processors using the -tp=px-64 flag.
I have found this to be necessary for the program to run on a number of machines.

A number of tweaks will get rid of the error:

  • Change the array type to complex(4)
  • Change the array type to real(8) or real(4)
  • Make the array two dimensional
  • Pass in a second one dimensional array and set the elements of one equal to the elements of the other

Is there any way to get around this issue?

Hi bdforbes2,

Unfortunately I’m unable to recreate the error you’re seeing. I’ve tried Linux, Windows, and MacOSX using various compiler versions and flag sets, but still no luck.

Can you please let us know what compiler version, what platform, and which compiler options you’re using?

We did add better support for Fortran complex data types in our late 2015 releases, so it’s possible that you’re encountering an issue that’s already been fixed. Until I have a reproducing example, though, I can’t tell for sure.

  • Mat

Hi mkcolg, thanks for getting back to me. Sorry for leaving out those important details!

I’m on Windows 10 Pro 64-bit, using PGI Visual Fortran 16.5-0 64-bit. My compiler flags are:

-Mcuda
-tp=px-64
-fast

I’ve now discovered that it’s actually the -fast option in combination with -tp=px-64 which causes the problem. Hopefully that helps to narrow down the problem.
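For anyone trying to reproduce this, a self-contained version of the test case might look like the sketch below. Only the kernel comes from the original post; the module name `foo_mod`, the program name `repro`, the array size, the launch configuration, and the command line in the comment are assumptions added for illustration.

```fortran
! Hypothetical reproducer; assumed compile line (not from the original post):
!   pgfortran -Mcuda -tp=px-64 -fast repro.f90
module foo_mod
    use cudafor
    implicit none
contains
    ! Kernel from the original post, unchanged apart from living in a module.
    attributes(global) subroutine foo(a, n)
        implicit none

        integer, value :: n
        complex(8), device :: a(n)

        integer :: ix

        ix = (blockIdx%x-1)*blockDim%x + threadIdx%x

        if (ix <= n) then
            a(ix) = 0.0_8
        endif
    end subroutine foo
end module foo_mod

program repro
    use cudafor
    use foo_mod
    implicit none

    integer, parameter :: n = 1024          ! assumed problem size
    integer, parameter :: nthreads = 256    ! assumed threads per block
    complex(8), device, allocatable :: a_d(:)
    complex(8) :: a_h(n)
    integer :: istat

    allocate(a_d(n))

    ! Launch enough blocks to cover all n elements.
    call foo<<<(n + nthreads - 1)/nthreads, nthreads>>>(a_d, n)
    istat = cudaDeviceSynchronize()

    ! Copy the result back to the host as a basic sanity check.
    a_h = a_d
    print *, 'max |a| = ', maxval(abs(a_h))

    deallocate(a_d)
end program repro
```

The internal compiler error is raised at compile time, so the host program is only there to make the file build and run as a complete unit once a working set of flags is found.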

Great, thanks. I’m now able to recreate the error. I added a problem report, TPR#22705, and sent it off to engineering.

As a workaround, use “-tp k8-64e”. This targets an AMD K8 revE 64-bit processor (~13 years old), so it should be compatible with all the systems you’re using.

  • Mat

Thanks for your help. I’ll try that workaround.

I don’t quite understand what it means to target a particular processor, or why that should be compatible with other systems. I would have thought that targeting the generic x86-64 processor should be the most general target. Do you know of any resources or references which could explain these issues?

I’m not sure it’s exactly what you’re looking for, but the -tp option is explained in section 2.3.63 of the PGI Compiler Reference Guide: PGI Documentation Archive for Versions Prior to 17.7

“I would have thought that targeting the generic x86-64 processor should be the most general target.”

It is, but since you encountered this issue, I suggested moving to the next lowest common denominator. In general, processors are backward compatible, so binaries targeting older architectures will run on newer ones.

However, by using “px-64” or “k8-64e” you will miss out on newer processor features that could improve the performance of your code. You may want to look at PGI’s Unified Binary feature, which allows you to target multiple architectures in a single binary. At runtime, the optimal code path is selected depending on which architecture the code is running on.

To create a unified binary, add multiple target processors to the “-tp” flag. For example, the following should cover most architectures:

-tp=k8-64e,piledriver,penryn,sandybridge,haswell

  • Mat