pgacclnk segmentation fault

Hi,

When I run pgacclnk on my system, it crashes with a segmentation fault. Is this a known bug?

I am using PGI Community Edition 18.10 on CentOS 7.6 (CUDA 10.0).

$ uname -r
3.10.0-957.1.3.el7.x86_64
$ pgacclnk
pgacclnk: execv failed: (null)
pgacclnk: child process exit status 1: (null)
$ ldd `which pgacclnk`
	linux-vdso.so.1 =>  (0x00007ffe88cfb000)
	libc.so.6 => /usr/lib64/libc.so.6 (0x00002acb5d59e000)
	/lib64/ld-linux-x86-64.so.2 (0x00002acb5d37a000)
$ pgfortran --version

pgfortran 18.10-1 64-bit target on x86-64 Linux -tp skylake 
PGI Compilers and Tools
Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.

Regards,
Amiya.

pgacclnk is not meant to be run by users. It is invoked by our compiler drivers as part of the CUDA device code generation process.

Brent,

pgacclnk was being run as part of building a computational chemistry package. My previous message was only intended to provide some diagnostic information. Can you tell what might be going wrong from the output below?

The exact error message I have is below. I have shortened/hidden some of the paths to make it cleaner.

pgfortran-Fatal-$PGINSTALL/bin/pgacclnk TERMINATED by signal 11
Arguments to $PGINSTALL/bin/pgacclnk
$PGINSTALL/bin/pgacclnk -nvidia $PGINSTALL/bin/pgnvd -cuda10000 -cudaroot /usr/local/cuda-10.0 -cudalink -computecap=30 -computecap=35 -computecap=50 -computecap=60 -computecap=70 -computecap=30 -computecap=35 -computecap=50 -computecap=60 -computecap=70 -computecap=30 -computecap=35 -computecap=50 -computecap=60 -computecap=70 -computecap=30 -computecap=35 -computecap=50 -computecap=60 -computecap=70 -init=cuda /usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o $PGINSTALL/lib/trace_init.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o $PGINSTALL/lib/initmp.o $PGINSTALL/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 $PGINSTALL/lib/pgi.ld -L$PGINSTALL/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lcublas -lpthread -lm -lc (((...object files...))) -rpath $PGINSTALL/lib -rpath /usr/local/cuda-10.0/lib64 -rpath /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 -o l701.exel -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 -Bstatic -Bdynamic -Bstatic -lcudafor91 -lcudafor -lcudaforblas -Bdynamic $PGINSTALL/lib/acc_init_link_cuda.o -Bstatic -laccapimp -laccncmp -laccnmp -laccgmp -laccncmp -laccnmp -laccg2mp -Bdynamic -L/usr/local/cuda-10.0/lib64 -Bstatic -lcudart_static -Bdynamic -ldl -ldl -Bstatic -lcudadevice -Bdynamic -Bstatic -lcudafor2 -Bdynamic -Bstatic -lpgf90rtl -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl $PGINSTALL/lib/nonuma.o -lpgmp -Bdynamic -Bstatic -Bdynamic -lpthread -Bstatic --start-group -lpgmath -lnspgc -lpgc --end-group -Bdynamic -lrt -lpthread -lm -lgcc -lc -lgcc -lgcc_s /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o /usr/lib64/crtn.o

Our PGI 18.10 shipped with CUDA 10.0, but it looks like you are somehow pointing to the CUDA 10 installed in /usr/local. Do you have something in your environment which is doing that? I’m not 100% sure that is the problem, but it is odd.
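One quick way to check, as a rough sketch: list everything in the environment that mentions CUDA and see which nvcc (if any) comes first on your PATH. The function name cuda_env_report is just a label made up for this example, not a PGI tool.

```shell
# Hypothetical helper: report CUDA-related settings in the current environment.
cuda_env_report() {
    # Variables whose name or value mentions CUDA (case-insensitive).
    env | grep -i cuda || echo "no CUDA-related variables found"
    # The first nvcc on PATH, if there is one.
    command -v nvcc || echo "nvcc not on PATH"
}

cuda_env_report
```

If CUDA_HOME or a similar variable shows up pointing at /usr/local/cuda-10.0, that is probably where the -cudaroot argument in your link line comes from.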

I have CUDA 10.0.130 installed in /usr/local and CUDA_HOME defined in my environment. The software's build is probably picking up the CUDA location from CUDA_HOME.

Do you suggest unsetting CUDA_HOME?

Regards,

I would certainly try that. If it works, we should try to understand what causes the failure, because CUDA_HOME should be supported, but it is possible there is some mismatch between the CUDA version we ship and the one you’ve installed.
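As a minimal sketch of that test, assuming the build reads CUDA_HOME from the shell you launch it from (I have not verified how your package's build system picks it up):

```shell
# Drop CUDA_HOME for the current shell session only; other shells are unaffected.
unset CUDA_HOME

# Confirm that the variable is gone before rebuilding.
if [ -z "${CUDA_HOME+x}" ]; then
    echo "CUDA_HOME is unset"
else
    echo "CUDA_HOME is still set to $CUDA_HOME"
fi

# Now re-run your package's configure and build steps from this same shell.
# Those commands are specific to your software, so they are not shown here.
```

After rebuilding, check whether the pgacclnk argument list still contains -cudaroot /usr/local/cuda-10.0. If it does, the path is coming from somewhere other than CUDA_HOME, for example a cached configure result or a makefile setting.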