PGI/LLVM 18.4: -S switch does not produce assembler output

While testing the first PGI/LLVM community edition, I stumbled upon the fact that the -S switch does not produce the expected assembler output. Instead, it produces LLVM IR output. IMHO this is not acceptable.

Best,

Hi Bert,

Apologies for my confusion, but the PGI LLVM compilers do produce assembly code when using the “-S” flag (or -Mkeepasm). Can you please give more detail on why you think it produces LLVM IR? Maybe there’s an issue with a particular system or flag combination?

On Skylake systems we do use “llvm-mc” as the assembler instead of “as”, but “as” should be able to assemble the file. On older x86 and Power systems, “as” will be used instead of “llvm-mc”.

For example:

% pgfortran -V

pgfortran 18.4-0 LLVM 64-bit target on x86-64 Linux -tp skylake
PGI Compilers and Tools
Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.
% ls grid.o grid.s
ls: cannot access grid.o: No such file or directory
ls: cannot access grid.s: No such file or directory
% pgfortran -S grid.F90
% ls grid.s
grid.s
% as grid.s -o grid.o
% ls grid.o
grid.o
% cat grid.s
        .text
        .file   "grid.ll"
        .globl  MAIN_                   # -- Begin function MAIN_
        .p2align        4, 0x90
        .type   MAIN_,@function
MAIN_:                                  # @MAIN_
.Lfunc_begin0:
        .file   1 "grid.F90"
        .loc    1 2 0                   # grid.F90:2:0
        .cfi_startproc
# BB#0:                                 # %L.entry
        .loc    1 2 1 prologue_end      # grid.F90:2:1
        pushq   %rbp
.Lcfi0:
        .cfi_def_cfa_offset 16
        pushq   %r14
.Lcfi1:
        .cfi_def_cfa_offset 24
        pushq   %rbx
.Lcfi2:
        .cfi_def_cfa_offset 32
        subq    $576, %rsp              # imm = 0x240
.Lcfi3:
        .cfi_def_cfa_offset 608
.Lcfi4:
        .cfi_offset %rbx, -32
.Lcfi5:
        .cfi_offset %r14, -24
.Lcfi6:
        .cfi_offset %rbp, -16
        movl    $.C283_MAIN_, %eax
        movl    %eax, %edi
        xorl    %eax, %eax
        movb    %al, %cl
        movb    %cl, %al
        callq   pghpf_init
        movq    $0, 568(%rsp)
        movq    $0, 440(%rsp)
        jmp     .LBB0_1
... continues...

Let’s now look at the verbose output (-v) from the compilation. We’ll use “-Mkeepasm” instead of “-S” so that the compilation continues past the generation of the assembly file.

Since this is a Skylake system, llvm-mc is used.

% pgfortran grid.F90 -v -c -Mkeepllvm -Mkeepasm
Export PGI=/proj/pgi

/proj/pgi/linux86-64-llvm/18.4/bin/pgf901-llvm grid.F90 -opt 1 -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -tp skylake -x 57 0xfb0000 -x 58 0x78031040 -x 47 0x08 -x 48 4608 -x 49 0x100 -stdinc /proj/pgi/linux86-64-llvm/18.4/include-gcc48:/proj/pgi/linux86-64-llvm/18.4/include:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include:/usr/local/include:/usr/include -cmdline '+pgfortran grid.F90 -v -c -Mkeepllvm -Mkeepasm' -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __LP64__ -def __x86_64 -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __extension__= -def __amd_64__amd64__ -def __k8 -def __k8__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -def __STDC_HOSTED__ -def __PGLLVM__ -def __extension__= -preprocess -freeform -vect 48 -y 54 1 -x 70 0x40000000 -x 70 0x40000000 -x 68 0x1 -y 163 0xc0000000 -x 189 0x10 -stbfile /tmp/pgfortran1NCkFqzkLX2L.stb -modexport /tmp/pgfortranDNCkxudmh90Q.cmod -modindex /tmp/pgfortranfNCkp39oYLKJ.cmdx -output /tmp/pgfortrannNCkNZItjAv3.ilm
PGF90-I-0922-Redundant definition for symbol __extension__ (grid.F90: -1)
  0 inform,   0 warnings,   0 severes, 0 fatal for test
PGF90/x86-64 Linux 18.4-0: compilation completed with informational messages

/proj/pgi/linux86-64-llvm/18.4/bin/pgf902-llvm /tmp/pgfortrannNCkNZItjAv3.ilm -fn grid.F90 -opt 1 -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -x 59 4 -tp skylake -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -astype 0 -x 183 4 -x 121 0x800 -x 54 0x10 -x 70 0x40000000 -x 249 50 -x 70 0x40000000 -x 164 0x800000 -x 124 1 -x 68 0x1 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -x 160 2 -cmdline '+pgfortran grid.F90 -v -c -Mkeepllvm -Mkeepasm' -stbfile /tmp/pgfortran1NCkFqzkLX2L.stb -asm grid.ll
  0 inform,   0 warnings,   0 severes, 0 fatal for test
PGF90/x86-64 Linux 18.4-0: compilation successful

/proj/pgi/linux86-64-llvm/18.4/share/llvm/bin/llc grid.ll -march=x86-64 -mcpu=native -O0 -fast-isel=0 -x86-cmov-converter=0 -o grid.s

/proj/pgi/linux86-64-llvm/18.4/share/llvm/bin/llvm-mc -filetype=obj -mcpu=skylake-avx512 grid.s -o grid.o

/proj/pgi/linux86-64-llvm/18.4/bin/pgappend -noerror grid.o -name .IPDINFO /tmp/pgfortranDNCkxudmh90Q.cmod -name .IPEINFO /tmp/pgfortranfNCkp39oYLKJ.cmdx
Unlinking /tmp/pgfortrannNCkNZItjAv3.ilm
Unlinking /tmp/pgfortran1NCkFqzkLX2L.stb
Unlinking /tmp/pgfortranDNCkxudmh90Q.cmod
Unlinking /tmp/pgfortranfNCkp39oYLKJ.cmdx

Moving to a Haswell system, we can see that “as” is now used:

% pgfortran -v -c -Mkeepasm grid.F90
Export PGI=/proj/pgi

/proj/pgi/linux86-64-llvm/18.4/bin/pgf901-llvm grid.F90 -opt 1 -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -tp haswell -x 57 0xfb0000 -x 58 0x78031040 -x 47 0x08 -x 48 4608 -x 49 0x100 -stdinc /proj/pgi/linux86-64-llvm/18.4/include-gcc48:/proj/pgi/linux86-64-llvm/18.4/include:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include:/usr/local/include:/usr/include -cmdline '+pgfortran grid.F90 -v -c -Mkeepasm' -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __LP64__ -def __x86_64 -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __extension__= -def __amd_64__amd64__ -def __k8 -def __k8__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -def __STDC_HOSTED__ -def __PGLLVM__ -def __extension__= -preprocess -freeform -vect 48 -y 54 1 -x 70 0x40000000 -x 70 0x40000000 -x 68 0x1 -y 163 0xc0000000 -x 189 0x10 -stbfile /tmp/pgfortran8qul09LcH0YN.stb -modexport /tmp/pgfortranCquluSvf7jcw.cmod -modindex /tmp/pgfortran8qul0odoCHku.cmdx -output /tmp/pgfortranCquluJz5DKpI.ilm
PGF90-I-0922-Redundant definition for symbol __extension__ (grid.F90: -1)
  0 inform,   0 warnings,   0 severes, 0 fatal for test
PGF90/x86-64 Linux 18.4-0: compilation completed with informational messages

/proj/pgi/linux86-64-llvm/18.4/bin/pgf902-llvm /tmp/pgfortranCquluJz5DKpI.ilm -fn grid.F90 -opt 1 -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -x 59 4 -tp haswell -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -astype 0 -x 183 4 -x 121 0x800 -x 54 0x10 -x 70 0x40000000 -x 249 50 -x 70 0x40000000 -x 164 0x800000 -x 124 1 -x 68 0x1 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -x 160 2 -cmdline '+pgfortran grid.F90 -v -c -Mkeepasm' -stbfile /tmp/pgfortran8qul09LcH0YN.stb -asm /tmp/pgfortranCqulufDmcBu5.ll
  0 inform,   0 warnings,   0 severes, 0 fatal for test
PGF90/x86-64 Linux 18.4-0: compilation successful

/proj/pgi/linux86-64-llvm/18.4/share/llvm/bin/llc /tmp/pgfortranCqulufDmcBu5.ll -march=x86-64 -mcpu=native -O0 -fast-isel=0 -x86-cmov-converter=0 -o grid.s

/usr/local/bin/as grid.s -o grid.o

/proj/pgi/linux86-64-llvm/18.4/bin/pgappend -noerror grid.o -name .IPDINFO /tmp/pgfortranCquluSvf7jcw.cmod -name .IPEINFO /tmp/pgfortran8qul0odoCHku.cmdx
Unlinking /tmp/pgfortranCquluJz5DKpI.ilm
Unlinking /tmp/pgfortran8qul09LcH0YN.stb
Unlinking /tmp/pgfortranCquluSvf7jcw.cmod
Unlinking /tmp/pgfortran8qul0odoCHku.cmdx
Unlinking /tmp/pgfortranCqulufDmcBu5.ll
Unlinking /tmp/pgfortran8qul0VfUMyse.llvm
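
The -v output above is long, so it can help to filter it down to the back-end steps only. The following is a rough sketch rather than anything shipped with the compilers: backend_steps is a hypothetical helper name, and the "2>&1" in the usage comment is there because I am assuming the driver may write its verbose output to stderr.

```shell
# Hypothetical helper: read driver output on stdin and keep only the
# lines that invoke llc, llvm-mc, or as (matched by the last path component).
backend_steps() {
    grep -E '^[^ ]*/(llc|llvm-mc|as) '
}

# Example (assumes pgfortran is on PATH):
#   pgfortran -v -c -Mkeepasm grid.F90 2>&1 | backend_steps
```

On the two systems shown above, this should leave the llc line plus either the llvm-mc line (Skylake) or the as line (Haswell).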

Any help in understanding why you think the generated “.s” file is not assembly would be appreciated!

Best Regards,
Mat

Yes, I can understand your confusion; I was confused as well. But I have worked out the difference:

$ pgfortran -V
pgfortran 18.4-0 LLVM 64-bit target on x86-64 Linux -tp haswell 
PGI Compilers and Tools
Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.
$ ls test-save.s test-save.o
ls: cannot access test-save.s: No such file or directory
ls: cannot access test-save.o: No such file or directory
$ pgfortran -S test-save.f90
$ head test-save.s 
	.text
	.file	"/tmp/pgfortran0-mdCyiHLVqP.ll"
	.globl	foo_                    # -- Begin function foo_
	.p2align	4, 0x90
	.type	foo_,@function
foo_:                                   # @foo_
.Lfunc_begin0:
	.file	1 "test-save.f90"
	.loc	1 1 0                   # test-save.f90:1:0
	.cfi_startproc
$ rm test-save.s
$ pgfortran -S test-save.f90 -o test-save.s
$ head test-save.s 
; ModuleID = 'test-save.f90'
target datalayout = "e-p:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
define void @foo_() noinline !dbg !14 {
L.entry:

	br label %L.LB1_300
L.LB1_300:
	ret void, !dbg !18
}

So the problem seems to be the combination of -S with -o. Can you confirm this?
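
To check quickly which kind of output a given file contains, one can look at its first few lines. This is only a sketch: looks_like_llvm_ir is a hypothetical helper name, and it assumes that textual LLVM IR has a "; ModuleID", "target datalayout" or "target triple" line near the top, as in the output above.

```shell
# Hypothetical helper: succeed (exit status 0) if the file looks like
# textual LLVM IR rather than assembly. Only the first 10 lines are checked.
looks_like_llvm_ir() {
    head -n 10 "$1" | grep -Eq '^(; ModuleID|target (datalayout|triple)) '
}

# Example:
#   if looks_like_llvm_ir test-save.s; then echo "LLVM IR"; else echo "assembly"; fi
```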

Great, thanks! That’s the difference.

I’ve filed an issue report, TPR#25919, and sent it on to our compiler engineers for further investigation.
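
In the meantime, a possible workaround is to avoid combining -S with -o and to rename the generated file afterwards. The sketch below is untested against the compilers themselves: asm_to is a hypothetical helper name, and it assumes that "pgfortran -S file.f90" writes file.s into the current directory, as it did in the session above.

```shell
# Hypothetical workaround helper: write the assembly for SRC to OUT
# without passing -o together with -S.
asm_to() {
    src=$1
    out=$2
    base=$(basename "$src")
    gen="${base%.*}.s"              # e.g. test-save.f90 -> test-save.s
    pgfortran -S "$src" || return 1
    # Rename only if the requested name differs from the default one.
    [ "$gen" = "$out" ] || mv "$gen" "$out"
}

# Example:
#   asm_to test-save.f90 test-save-asm.s
```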

Thanks again for bringing this to our attention.

-Mat