Hi Mat,
The CUDA version is 8.0.44. I can look into getting the latest version if that would help. I am using Linux with an NVIDIA V100 PCIe and the following CPU:
vendor id : GenuineIntel
model name : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
cpu family : 6
model : 63
name : Haswell
stepping : 2
processors : 12
threads : 2
clflush size : 8
L2 cache size : 256KB
L3 cache size : 15360KB
Here is the output using verbose mode:
Export PGI_CURR_CUDA_HOME=/ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1
Export PGI=/ihome/crc/install/pgi/18.10
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgf901 trand2.cuf -opt 1 -nohpf -nostatic -x 19 0x400000 -quad -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c -x 58 0x10000 -x 124 0x1000 -tp haswell -x 57 0xfb0000 -x 58 0x78031040 -x 47 0x08 -x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /ihome/crc/install/pgi/18.10/linux86-64/18.10/include-gcc48:/ihome/crc/install/pgi/18.10/linux86-64/18.10/include:/ihome/crc/install/python/miniconda3-3.7/include/python3.7m:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include:/usr/local/include:/usr/include -cmdline '+pgfortran trand2.cuf -Mcuda=nollvm -Mcudalib=curand -v' -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __LP64__ -def __x86_64 -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __extension__= -def __amd_64__amd64__ -def __k8 -def __k8__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -def _CUDA -def _CUDA -def __CUDA_API_VERSION=9010 -freeform -x 137 1 -x 121 0xc00 -x 180 0x4000000 -cudaver 9010 -vect 48 -x 54 1 -def __CUDA_API_VERSION=9010 -cudaver 9.1 -x 70 0x40000000 -x 189 0x8000 -y 163 0xc0000000 -x 137 1 -modexport /tmp/pgfortranAMIcooVrk5IC.cmod -modindex /tmp/pgfortranQMIc_-O0fSvz.cmdx -output /tmp/pgfortran6MIcUufHtTGQ.ilm
0 inform, 0 warnings, 0 severes, 0 fatal for mtests
0 inform, 0 warnings, 0 severes, 0 fatal for testany
0 inform, 0 warnings, 0 severes, 0 fatal for t
PGF90/x86-64 Linux 18.10-0: compilation successful
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgf902 /tmp/pgfortran6MIcUufHtTGQ.ilm -fn trand2.cuf -opt 1 -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -x 59 4 -tp haswell -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -x 120 0x200 -astype 0 -x 137 1 -x 121 0xc00 -x 180 0x4000000 -cudaver 9010 -x 68 0x20 -x 176 0x100 -cudacap 35 -cudacap 50 -cudacap 60 -cudacap 70 -cudaver 9010 -x 70 0x40000000 -x 164 0x800000 -x 124 1 -x 189 0x10 -x 189 0x8000 -y 163 0xc0000000 -y 189 0x4000000 -cudaroot /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -x 137 1 -x 121 0xc00 -x 180 0x4000000 -x 176 0x100 -cudacap 35 -cudacap 50 -cudacap 60 -cudacap 70 -cudaver 9010 -cmdline '+pgfortran trand2.cuf -Mcuda=nollvm -Mcudalib=curand -v' -asm /tmp/pgfortran6MIcU-WwwL97.s
0 inform, 0 warnings, 0 severes, 0 fatal for mtests
0 inform, 0 warnings, 0 severes, 0 fatal for testany
0 inform, 0 warnings, 0 severes, 0 fatal for t
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd -dcuda /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -usenvvm -reloc /tmp/pgcudafor5OIcRlB-bJb5.gpu -computecap=35 -ptx /tmp/pgcudaforjOIcBXzAru9L.ptx -o /tmp/pgcudaforrOIcZ8h2z2hV.bin -cuda9010
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd -dcuda /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -usenvvm -reloc /tmp/pgcudafor5OIcRlB-bJb5.gpu -computecap=50 -ptx /tmp/pgcudaforXOIctO-g54Uz.ptx -o /tmp/pgcudafor5OIcRQCBbag9.bin -cuda9010
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd -dcuda /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -usenvvm -reloc /tmp/pgcudafor5OIcRlB-bJb5.gpu -computecap=60 -ptx /tmp/pgcudaforrOIcZdcJzl2L.ptx -o /tmp/pgcudaforzOIclIYNHMhA.bin -cuda9010
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd -dcuda /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -usenvvm -reloc /tmp/pgcudafor5OIcRlB-bJb5.gpu -computecap=70 -ptx /tmp/pgcudaforXOIctph45TfD.ptx -o /tmp/pgcudafor5OIcRjH1bDt7.bin -cuda9010
/ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd -dcuda /ihome/crc/install/pgi/18.10/linux86-64/2018/cuda/9.1 -reloc -cuda9010 -fat trand2.cuf -sm 35 /tmp/pgcudaforrOIcZ8h2z2hV.bin -sm 50 /tmp/pgcudafor5OIcRQCBbag9.bin -sm 60 /tmp/pgcudaforzOIclIYNHMhA.bin -sm 70 /tmp/pgcudafor5OIcRjH1bDt7.bin -compute 70 /tmp/pgcudaforXOIctph45TfD.ptx -o /tmp/pgaccbOIcdxkc1fp3.fat
nvlink error : Undefined reference to '__pgicudalib_curandNormalXORWOW' in '/tmp/pgfortranAMIcoPaoH9U5.o'
nvlink error : Undefined reference to '__pgicudalib_curandUniformXORWOW' in '/tmp/pgfortranAMIcoPaoH9U5.o'
nvlink error : Undefined reference to '__pgicudalib_curandInitXORWOW' in '/tmp/pgfortranAMIcoPaoH9U5.o'
pgacclnk: child process exit status 2: /ihome/crc/install/pgi/18.10/linux86-64/18.10/bin/pgnvd
pgfortran-Fatal-linker completed with exit code 2
Unlinking /tmp/pgfortran6MIcUufHtTGQ.ilm
Unlinking /tmp/pgfortrankMIcE6VUp8G3.stb
Unlinking /tmp/pgfortranAMIcooVrk5IC.cmod
Unlinking /tmp/pgfortranQMIc_-O0fSvz.cmdx
Unlinking /tmp/pgfortran6MIcU-WwwL97.s
Unlinking /tmp/pgfortrankMIcE94XoIfP.ll
Unlinking /tmp/pgfortranAMIcoPaoH9U5.o
It may also be relevant that the GPU is part of a cluster managed by SLURM.
Thanks,
Ben