I have run the code before without issues, but after adding extra flags for profiling I get this error:
pgfortran -Mcuda -Minfo -ta=nvidia -c precision_m.F90
pgfortran -Mcuda -Minfo -ta=nvidia -c cpurandom_m.F90
cpp -DGLOBAL host_gen_m.CUF > host_gen_m1.CUF
pgfortran -Mcuda -Minfo -ta=nvidia -c host_gen_m1.CUF
cpp -DGLOBAL host_subs_m.CUF > host_subs_m1.CUF
pgfortran -Mcuda -Minfo -ta=nvidia -c host_subs_m1.CUF
Stack dump:
0. Running pass 'Simplify Live Out' on function '@host_subs_m_d_local_energy_'
pgnvd-Fatal-/state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc TERMINATED by signal 11
Arguments to /state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc
/state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc -arch compute_20 -m64 -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 /tmp/pgnvdsKkd2_iVgVe1.i -o /tmp/pgcudaforSJkde3FtmLFR.ptx -nvvmir-library /state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/libdevice/libdevice.compute_20.10.bc
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (host_subs_m1.CUF: 1)
PGF90/x86-64 Linux 14.10-0: compilation aborted
make: *** [host_subs_m1.o] Error 2
The problem seems to come from the local energy subroutine. Here are some of the necessary modules in case you want to reproduce the error:
https://dl.dropboxusercontent.com/u/59996494/host_subs_m1%20(Fannelia's%20conflicted%20copy%202015-06-22).CUF
https://dl.dropboxusercontent.com/u/59996494/host_gen_m1.CUF
This seems to be a compiler bug triggered during optimization, because the code compiles correctly at -O0 and -O1.
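One way to confirm which optimization levels trigger the crash is to sweep them on the failing file. The sketch below is only an illustration: try_opt_levels is a hypothetical helper (not part of the PGI tools), it assumes the compiler is pgfortran unless FC is set, and it reuses the -Mcuda and -ta=nvidia flags from the build log above.

```shell
#!/bin/sh
# Sketch: compile one source file at several optimization levels and
# report which levels succeed. FC defaults to pgfortran (assumption);
# try_opt_levels is a hypothetical helper name.
try_opt_levels() {
    src=$1
    for opt in -O0 -O1 -O2; do
        # Compiler output is discarded; only the exit status is used.
        if ${FC:-pgfortran} -Mcuda -ta=nvidia "$opt" -c "$src" >/dev/null 2>&1; then
            echo "$opt ok"
        else
            echo "$opt FAILED"
        fi
    done
}

# Example: try_opt_levels host_subs_m1.CUF
```

A level reported as FAILED only says that the compiler exited with a non-zero status; the full compiler output is still needed to tell a cicc crash from an ordinary syntax error.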
Hi egodfred,
I haven’t been able to reproduce your error. Instead, I get a different error with 14.7 and a successful compilation with 14.9.
Which version are you using?
Can you post the source for “cpurandom_m.F90”? I had to comment it out in order to get the source to compile. Also, I’m using a “percision_m.F90” file from one of your earlier posts. Please re-post it if there have been updates.
Thanks,
Mat
% pgfortran -Mcuda -Minfo=accel -c percision_m.F90 host_gen_m1.CUF host_subs_m1.CUF -V14.7 -acc -O2
percision_m.F90:
host_gen_m1.CUF:
host_subs_m1.CUF:
/tmp/pgcudaforGVLdGvteGa6K.gpu(2001): error: argument of type "const char *" is incompatible with parameter of type "void *"
1 error detected in the compilation of "/tmp/pgnvdLWLdVdqHsRtC.nv0".
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (host_subs_m1.CUF: 1)
PGF90/x86-64 Linux 14.7-0: compilation aborted
%
% pgfortran -Mcuda -Minfo=accel -c percision_m.F90 host_gen_m1.CUF host_subs_m1.CUF -V14.9 -acc -O2
percision_m.F90:
host_gen_m1.CUF:
host_subs_m1.CUF:
%
Here is the nvcc version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_21:41:27_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12
and compiler version
pgf90 14.10-0 64-bit target on x86-64 Linux -tp istanbul
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
Thanks egodfred. With the full source, I was able to recreate the error. It does not occur with the 15.x compiler or when the OpenACC flag “-ta=nvidia” is removed. Since you don’t have OpenACC code in this file, I’d recommend removing the “-ta” flag.
% pgfortran -Mcuda -Minfo -ta=nvidia -DGLOBAL -c host_subs_m.CUF -V14.10
Stack dump:
0. Running pass 'Simplify Live Out' on function '@host_subs_m_d_local_energy_'
pgnvd-Fatal-/proj/pgi/linux86-64/2014/cuda/6.0/nvvm/bin/cicc TERMINATED by signal 11
Arguments to /proj/pgi/linux86-64/2014/cuda/6.0/nvvm/bin/cicc
/proj/pgi/linux86-64/2014/cuda/6.0/nvvm/bin/cicc -arch compute_20 -m64 -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 /tmp/pgnvdLfcdVgGNULiI.i -o /tmp/pgcudafor-dcd9-4p3-6U.ptx -nvvmir-library /proj/pgi/linux86-64/2014/cuda/6.0/nvvm/libdevice/libdevice.compute_20.10.bc
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (host_subs_m.CUF: 1)
PGF90/x86-64 Linux 14.10-0: compilation aborted
% pgfortran -Mcuda -Minfo -DGLOBAL -c host_subs_m.CUF -V14.10
% pgfortran -Mcuda -Minfo -ta=nvidia -DGLOBAL -c host_subs_m.CUF -V15.1
%
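If the flags live in a shared variable, the suggestion to drop “-ta” can be applied without retyping each command. This is a minimal sketch under that assumption: strip_ta and the FFLAGS variable are hypothetical names, and the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch: drop any -ta=... option from a flag string and leave the
# rest untouched. strip_ta is a hypothetical helper, not a PGI tool.
strip_ta() {
    out=
    for flag in $1; do
        case $flag in
            -ta|-ta=*) ;;                      # skip the OpenACC target flag
            *) out="${out:+$out }$flag" ;;     # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

FFLAGS="-Mcuda -Minfo -ta=nvidia"
# Print the resulting command instead of running it.
echo "pgfortran $(strip_ta "$FFLAGS") -DGLOBAL -c host_subs_m.CUF"
```

The printed command has the same flags as the second command in the transcript above, apart from the -V14.10 version selector.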
pgfortran -Mcuda -Minfo=accel -c precision_m.F90 cpurandom_m.F90 host_gen_m1.CUF host_subs_m1.CUF main.CUF -acc -O2 gpuqmc
precision_m.F90:
cpurandom_m.F90:
host_gen_m1.CUF:
host_subs_m1.CUF:
Stack dump:
0. Running pass 'Simplify Live Out' on function '@host_subs_m_d_local_energy_'
pgnvd-Fatal-/state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc TERMINATED by signal 11
Arguments to /state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc
/state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/bin/cicc -arch compute_20 -m64 -ftz=1 -prec_div=1 -prec_sqrt=1 -fmad=1 /tmp/pgnvdX7QftwwRupaZ.i -o /tmp/pgcudaforP6Qf7raYX1W0.ptx -nvvmir-library /state/partition1/pgi14/linux86-64/2014/cuda/6.0/nvvm/libdevice/libdevice.compute_20.10.bc
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (host_subs_m1.CUF: 1)
PGF90/x86-64 Linux 14.10-0: compilation aborted
main.CUF:
Sorry about that. I only checked the one file and not the whole project. Looks like the problem is actually with the CUDA C code generator. Using LLVM works for me. Can you give this a try?
% pgfortran -Mcuda=llvm -Minfo=accel -c precision_m.F90 cpurandom_m.F90 host_gen_m1.CUF host_subs_m1.CUF main.CUF -acc -O2 gpuqmc -V14.10
precision_m.F90:
cpurandom_m.F90:
host_gen_m1.CUF:
host_subs_m1.CUF:
main.CUF:
It compiles, but gives this error at runtime:
0: DEV_BIND_TEXTURE: cudaBindTexture failed: 18(invalid texture reference)
I have not seen this before, and the only references I can find on the web occur when trying to use textures on older devices where textures weren’t supported.
Though, I doubt that’s the problem here. Can you post or send to PGI Customer Service (trs@pgroup.com) your data files so I can try to recreate the error? Hopefully I can then determine the cause.
Thanks,
Mat
I have sent the files. I hope to hear from you soon.
Device info:
CUDA Driver Version: 6050
NVRM version: NVIDIA UNIX x86_64 Kernel Module 340.29 Thu Jul 31 20:23:19 PDT 2014
Device Number: 0
Device Name: GeForce GTX 670
Device Revision Number: 3.0
Global Memory Size: 4294770688
Number of Multiprocessors: 7
Number of SP Cores: 1344
Number of DP Cores: 448
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 980 MHz
Execution Timeout: No
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 3004 MHz
Memory Bus Width: 256 bits
L2 Cache Size: 524288 bytes
Max Threads Per SMP: 2048
Async Engines: 1
Unified Addressing: Yes
Initialization time: 540639 microseconds
Current free memory: 4246446080
Upload time (4MB): 1300 microseconds ( 905 ms pinned)
Download time: 2726 microseconds (1053 ms pinned)
Upload bandwidth: 3226 MB/sec (4634 MB/sec pinned)
Download bandwidth: 1538 MB/sec (3983 MB/sec pinned)
PGI Compiler Option: -ta=tesla:cc30
compiler version:
pgfortran 14.10-0 64-bit target on x86-64 Linux -tp istanbul
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
NVCC Version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_21:41:27_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12
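Since the question was whether the device might be too old for textures, the relevant field in the listing above is the Device Revision Number, which appears to be the compute capability (it matches the cc30 in the suggested -ta=tesla:cc30 option). The sketch below pulls that field out of such a listing; device_revision is a hypothetical helper, and the assumption is that the listing arrives on standard input in the "Name: value" layout shown above.

```shell
#!/bin/sh
# Sketch: read a device listing on stdin and print the value of the
# "Device Revision Number" field. device_revision is a hypothetical
# helper; the field layout is taken from the listing in this thread.
device_revision() {
    awk -F': *' '/Device Revision Number:/ { print $2; exit }'
}

# Sample lines copied from the listing above.
device_revision <<'EOF'
Device Name: GeForce GTX 670
Device Revision Number: 3.0
PGI Compiler Option: -ta=tesla:cc30
EOF
```

Only the first matching device is reported, so a machine with several GPUs would need the early exit removed.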
Hi Godfred,
When I compile with “-DTEXTURE”, I get many syntax errors. Do I have the most recent version? With “-DGLOBAL”, I get a runtime error since there’s no “lipot.dat” data file. Do I need this file or am I doing something wrong?
Thanks,
Mat
% pgfortran -Mcuda -Minfo=accel precision_m.F90 cpurandom_m.F90 host_gen_m.CUF host_subs_m.CUF main.CUF -acc -O2 -o gpuqmc -DGLOBAL -V15.7
precision_m.F90:
cpurandom_m.F90:
host_gen_m.CUF:
host_subs_m.CUF:
main.CUF:
% gpuqmc
Read Mass of Helium M_he (cm) : 4.002603240000000
Read box length dmax (a.u) : 2.000000000000000
Read Minimum distance between Helium atoms and impurity (a.u) :
0.000000000000000
Read Maximum potential between atoms (a.u) : 1.000000000000000
Read Minimum potential between atoms (a.u) : -700.0000000000000
Read Minimum distance between Helium atoms (a.u) : 1.000000000000000
Read Mass of atomic impurity : 6.941000000000000
Read Wavefunction for imp-he parameter 1 : 1251.550607449640
Read Wavefunction for imp-he parameter 2 : 3.331331768526400
Read Wavefunction for he-he parameter 1 : 3333.489699684990
Read Wavefunction for he-he parameter 2 : 2.6974148946484230E-018
Read Number of Helium atoms : 3
Read Number of walkers : 10240
Read Number of micro Updates : 1
Read Number of macro updates (markov chain walks) : 10000
switch integer to decide where to read initial configuration : 1
Read perturbation step : 0.7000000000000000
Number of Blocks : 1
Base Number : 8
Number of msteps : 1.000000000000000
Reduced mass is : 4627.691078620828
PGFIO-F-217/list-directed read/unit=16/attempt to read past end of file.
File name = lipot.dat formatted, sequential access record = 1
In source file host_gen_m.CUF, at line number 314
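The end-of-file error on record 1 above is what a missing or empty data file looks like from inside the program, so a check before launching can give a clearer message. This is a minimal sketch; require_input is a hypothetical helper and is not part of the posted code.

```shell
#!/bin/sh
# Sketch: refuse to start the run when a required input file is
# missing or empty. require_input is a hypothetical helper.
require_input() {
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$1" ]; then
        echo "missing or empty input file: $1" >&2
        return 1
    fi
}

# Example: require_input lipot.dat && gpuqmc
```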
Sorry about that. Yes, you need the lipot.dat file to generate the Li-He potential function.
https://dl.dropboxusercontent.com/u/59996494/lipot.dat
Thanks egodfred, though the link appears to be broken or the file doesn’t exist. Can you double check?
Thanks,
Mat
Sorry about that, the link should be working now. I also included the file in the code I recently sent.
After reviewing egodfred’s code, this appears to be the same as a known problem that we had in the late 14.x releases when multiple shared arrays were used in the same kernel. The error was fixed in the 15.1 release, and we’re working on getting egodfred updated to this release.
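To tell whether an installed compiler is 15.1 or later, the release number has to be compared as two integers rather than as a decimal (14.10 is newer than 14.7). A small sketch of that comparison follows; at_least_15_1 is a hypothetical helper, and the version strings are the ones that appear in this thread.

```shell
#!/bin/sh
# Sketch: succeed when a PGI version string such as "14.10-0" or
# "15.1" is at least 15.1. at_least_15_1 is a hypothetical helper.
at_least_15_1() {
    ver=${1%%-*}        # "14.10-0" -> "14.10"
    major=${ver%%.*}    # "14.10"   -> "14"
    minor=${ver#*.}     # "14.10"   -> "10"
    [ "$major" -gt 15 ] || { [ "$major" -eq 15 ] && [ "$minor" -ge 1 ]; }
}

for v in 14.7 14.10-0 15.1 15.7; do
    if at_least_15_1 "$v"; then
        echo "$v: 15.1 or later"
    else
        echo "$v: earlier than 15.1"
    fi
done
```

A release earlier than 15.1 is not necessarily affected; the check only says whether the release is old enough that the fix described above cannot be in it.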