Adding -fast leads to ICE with CUDA Fortran Program

I’m currently CUDA-izing a large code and I’m almost to the point where I think I can try it out, but I seem to have hit a bug that, while not a showstopper, isn’t something I want. Namely, if I compile this code with just -Mcuda, it compiles:

> pgfortran -Mcuda=ptxinfo -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90 
ptxas info    : Compiling entry function 'irrad'
ptxas info    : Used 64 registers, 15712+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 496 bytes cmem[1]
ptxas info    : Compiling entry function 'irrad'
ptxas info    : Used 71 registers, 15696+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 512 bytes cmem[1]

Now, true, the lmem is a bit high (and I’m actively trying to lower it) and the cmem number is way, way too high (I’ve only declared 1896 B of constant memory, so why it’s up there is unknown), but, again, it at least compiles. (NB: I have an email in to PGI Customer Support about that cmem oddity. Not sure what the issue is there.)

But, when I then add -fast, I get an ICE and a corefile:

> pgfortran -Mcuda=ptxinfo -fast -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90
PGF90-S-0000-Internal compiler error. unsupported procedure     900 (irrad.iend.F90: 1457)
  0 inform,   0 warnings,   1 severes, 0 fatal for irrad
ptxas info    : Compiling entry function 'irrad'
pgnvd-Fatal-/opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas TERMINATED by signal 11
Arguments to /opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas
/opt/pgi/linux86-64/2010/cuda/2.3/bin/ptxas -arch=sm_10 /tmp/pgcudaforevZdmvxp7Tfb.ptx -maxrregcount=64 -o /tmp/pgcudaforKvZdS_r8AU1F.bin -v
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code       0 (irrad.iend.F90: 2978)
PGF90/x86-64 Linux 10.8-0: compilation aborted

where line 1457 is the ‘end subroutine’ line of the main procedure.

Okay. So, I was a bit surprised to see the 2.3 in that output as I thought CUDA 3.1 was default now (isn’t it?), so to try one last thing, I added 3.1:

> pgfortran -Mcuda=ptxinfo,3.1 -fast -DMAXNP=72 -DMAXNS=1 -c irrad.iend.F90
PGF90-S-0000-Internal compiler error. unsupported procedure     900 (irrad.iend.F90: 1457)
  0 inform,   0 warnings,   1 severes, 0 fatal for irrad
ptxas info    : Compiling entry function 'irrad' for 'sm_10'
ptxas info    : Used 58 registers, 15828+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 476 bytes cmem[1]
ptxas info    : Compiling entry function 'irrad' for 'sm_13'
ptxas info    : Used 90 registers, 15604+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 524 bytes cmem[1]
ptxas info    : Compiling entry function 'irrad' for 'sm_20'
ptxas info    : Used 63 registers, 4+0 bytes lmem, 68 bytes cmem[0], 7536 bytes cmem[2], 336 bytes cmem[16]

Huh. This time it doesn’t explicitly say it’s aborting compilation, and no core file is generated either (and there’s the cute sm_20 function), but no .o was generated, so it did break.

Any ideas of this error, or should I send this to PGI Customer Support?

Thanks,
Matt

Hi Matt

“Okay. So, I was a bit surprised to see the 2.3 in that output as I thought CUDA 3.1 was default now (isn’t it?)”

No, the default is still 2.3.

“Any ideas of this error, or should I send this to PGI Customer Support?”

Send it to customer support and ask them to send it to me, or just send it to me directly. I’m off next week on vacation but will look at it today. I’ll also ask Dave about the cmem issue.

Thanks,
Mat

Okay. I guess the bleeding-edge man in me thinks newer is always default…much to the chagrin of all my sysadmins.

On its way! Thank you for your help.

Matt

You can set the default CUDA Toolkit version to use (which is different from the CUDA driver version) when compiling CUDA Fortran or PGI Accelerator programs. If you have write permission to the PGI installation directory, you can create a file ‘siterc’ in the bin/ directory containing the pgf90 driver and all the other ‘rcfiles’ (pgf90rc, etc.), or edit one if it’s already there. Add a line
set DEFCUDAVERSION=3.1;
to make the default 3.1, or make it 3.0 if that’s what you want.

If you don’t have write permissions, or you don’t want to change the default for all users, you can create a file in your home ($HOME) directory named .mypgirc (for Windows, it will be just mypgirc, no ‘dot’), and add the same line. This file is used for all PGI compilers.
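For the per-user route, something like the following should work from a shell on Linux. This is only a sketch: it assumes a POSIX shell, uses the file name and setting given above, and the `rcfile` variable is just an illustrative name. It skips the append if a DEFCUDAVERSION line is already present, so check the file by hand if you want a different version.

```shell
# Per-user rcfile read by the PGI compiler drivers (path as described above).
rcfile="${HOME}/.mypgirc"

# Append the setting only if no DEFCUDAVERSION line exists yet,
# so running this twice does not duplicate the line.
if ! grep -qs 'DEFCUDAVERSION' "$rcfile"; then
    echo 'set DEFCUDAVERSION=3.1;' >> "$rcfile"
fi

# Show the line the drivers will pick up.
grep 'DEFCUDAVERSION' "$rcfile"
```

With that in place, a plain -Mcuda should no longer need the explicit 3.1 sub-option; the ptxas path in the compiler output is a quick way to confirm which toolkit is actually being used.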