I’m currently CUDA-izing a large code and I’m almost at the point where I can try it out, but I’ve hit a bug that, while not a showstopper, isn’t something I want. Namely, if I compile this code with just -Mcuda (no optimization flags), it compiles:
> pgfortran -Mcuda=ptxinfo -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90
ptxas info : Compiling entry function 'irrad'
ptxas info : Used 64 registers, 15712+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 496 bytes cmem[1]
ptxas info : Compiling entry function 'irrad'
ptxas info : Used 71 registers, 15696+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 512 bytes cmem[1]
Now, true, the lmem is a bit high (and I’m actively trying to lower it) and the cmem[0] number is far too high (I’ve only declared 1896 B of constant memory, so I don’t know why it reports 7536 bytes), but, again, it at least compiles. (NB: I have an email in to PGI Customer Support about that cmem oddity. Not sure what the issue is there.)
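For anyone who wants to compare these figures across builds, here is a minimal Python sketch that pulls the numbers out of a "ptxas info : Used ..." line. The function name parse_ptxas_usage is hypothetical, and the regular expressions simply assume the line format shown in the output above; it is not part of any PGI or NVIDIA tool.

```python
import re

# Assumed format: "Used 64 registers, 15712+0 bytes lmem, 7536 bytes cmem[0], ..."
USED_RE = re.compile(r"Used (\d+) registers")
MEM_RE = re.compile(r"(\d+)(?:\+(\d+))? bytes (\w+(?:\[\d+\])?)")


def parse_ptxas_usage(line):
    """Return resource usage from one 'Used ...' line, or None if it is not one.

    Memory entries are kept as tuples of the raw numbers, e.g. (15712, 0) for
    "15712+0 bytes lmem", without interpreting what each part means.
    """
    match = USED_RE.search(line)
    if match is None:
        return None
    usage = {"registers": int(match.group(1))}
    for first, second, name in MEM_RE.findall(line):
        parts = (int(first),) if second == "" else (int(first), int(second))
        usage[name] = parts
    return usage


# Line copied from the first compile above.
sample = ("ptxas info : Used 64 registers, 15712+0 bytes lmem, 36+16 bytes smem, "
          "7536 bytes cmem[0], 496 bytes cmem[1]")
usage = parse_ptxas_usage(sample)

declared_constant_bytes = 1896  # what I declared in the source
excess = usage["cmem[0]"][0] - declared_constant_bytes
print(usage)
print("cmem[0] bytes beyond declared constants:", excess)
```

Feeding it every "Used" line from a build log makes it easier to see how lmem and cmem change between flag combinations.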
But, when I then add -fast, I get an ICE and a corefile:
> pgfortran -Mcuda=ptxinfo -fast -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90
PGF90-S-0000-Internal compiler error. unsupported procedure 900 (irrad.iend.F90: 1457)
0 inform, 0 warnings, 1 severes, 0 fatal for irrad
ptxas info : Compiling entry function 'irrad'
pgnvd-Fatal-/opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas TERMINATED by signal 11
Arguments to /opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas
/opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas -arch=sm_10 /tmp/pgcudaforevZdmvxp7Tfb.ptx -maxrregcount=64 -o /tmp/pgcudaforKvZdS_r8AU1F.bin -v
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code 0 (irrad.iend.F90: 2978)
PGF90/x86-64 Linux 10.8-0: compilation aborted
where line 1457 is the ‘end subroutine’ line of the main procedure.
Okay. I was a bit surprised to see the 2.3 in that output, as I thought CUDA 3.1 was the default now (isn’t it?), so as one last thing to try, I added 3.1:
> pgfortran -Mcuda=ptxinfo,3.1 -fast -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90
PGF90-S-0000-Internal compiler error. unsupported procedure 900 (irrad.iend.F90: 1457)
0 inform, 0 warnings, 1 severes, 0 fatal for irrad
ptxas info : Compiling entry function 'irrad' for 'sm_10'
ptxas info : Used 58 registers, 15828+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 476 bytes cmem[1]
ptxas info : Compiling entry function 'irrad' for 'sm_13'
ptxas info : Used 90 registers, 15604+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 524 bytes cmem[1]
ptxas info : Compiling entry function 'irrad' for 'sm_20'
ptxas info : Used 63 registers, 4+0 bytes lmem, 68 bytes cmem[0], 7536 bytes cmem[2], 336 bytes cmem[16]
Huh. This time it doesn’t explicitly say it’s aborting compilation, and no core file is generated either (and there’s the cute sm_20 function), but the same internal compiler error is reported and no .o was generated, so it did break.
Any ideas about this error, or should I send this to PGI Customer Support?
Thanks,
Matt