Constants and Parameters: Simplifying CPU/GPU Code

All,

In some shared CPU/GPU code I’m using, I currently do something like this to load an array into constant memory on the GPU, while the CPU build just uses the module variable directly:

constants.F90:
module sorad_constants
   implicit none
   real, dimension(5) :: wk_uv = &
      [0.0, 0.0, 0.0, 0.0, 0.00075]
end module

kernel.F90:
#ifdef _CUDA
real, dimension(5), constant :: wk_uv
#else
use sorad_constants, only: wk_uv
#endif

driver.F90:
#ifdef _CUDA
  use sorad_constants, only: wk_uv_const=>wk_uv
  wk_uv = wk_uv_const
#else
  ! nothing, we use the variable in the kernel
#endif

However, I was wondering if there is a nicer way to do this. For example, my current kernel.F90 already declares some parameters, so I thought I could try making that module variable a parameter:

constants.F90:
module sorad_constants
   implicit none
   real, dimension(5), parameter :: wk_uv = &
      [0.0, 0.0, 0.0, 0.0, 0.00075]
end module

kernel.F90:
! both CPU and GPU use it
use sorad_constants, only: wk_uv

driver.F90:
  ! nothing, we use the variable in the kernel

but when I do that, pgf902 crashes with signal 11 and dumps core:

pgfortran-Fatal-/opt/pgi/linux86-64/15.7/bin/pgf902 TERMINATED by signal 11
Arguments to /opt/pgi/linux86-64/15.7/bin/pgf902
/opt/pgi/linux86-64/15.7/bin/pgf902 /tmp/pgfortrann90NGqVhFLt.ilm -fn sorad.F90 -opt 2 -terse 1 -inform warn -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -x 117 0x1000 -quad -vect 56 -y 34 16 -x 34 0x8 -y 19 8 -y 35 0 -x 42 0x30 -x 39 0x40 -x 199 10 -x 39 0x80 -x 34 0x400000 -x 149 1 -x 150 1 -x 59 4 -x 6 0x100 -y 129 2 -tp px -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -x 120 0x200 -astype 0 -x 121 1 -ieee 1 -x 124 1 -accel tesla -x 186 0x80000 -x 180 0x400 -x 180 0x4000000 -x 121 0xc00 -x 194 0x40000 -x 163 0x1 -x 186 0x80000 -x 180 0x400 -x 180 0x4000000 -cudaver 6.5 -x 121 0xc00 -x 194 0x40000 -x 175 72 -x 176 0x100 -cudacap 20 -cudacap 20 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -x 9 1 -x 42 0x14200000 -x 72 0x1 -x 136 0x11 -quad -x 119 0x10000000 -x 129 0x40000000 -x 129 2 -x 164 0x1000 -x 42 0x400000 -y 129 4 -x 129 0x400 -x 137 1 -x 180 0x4000000 -x 163 0x40 -y 186 0x2000000 -y 201 0x08 -x 201 0x04 -x 163 0x400000 -x 175 72 -x 176 0x100 -cudacap 20 -cudacap 20 -cudaver 6.5 -x 163 0x40 -y 186 0x2000000 -y 201 0x08 -x 201 0x04 -x 0 0x1000000 -x 2 0x100000 -x 0 0x2000000 -x 161 16384 -x 162 16384 -x 62 8 -x 124 0x40 -x 24 1 -cmdline '+pgfortran ...' -ccff /tmp/pgfortranv90-U27fdtN.ccff -asm /tmp/pgfortranT90h_EfM1l2.sm

so that’s probably not allowed. If I forget to use “parameter” I get:

PGF90-S-0520-Host MODULE data cannot be used in a DEVICE or GLOBAL subprogram - wk_uv (sorad.F90: 575)
PGF90-S-0520-Host MODULE data cannot be used in a DEVICE or GLOBAL subprogram - wk_uv (sorad.F90: 585)

which obviously makes sense. Thus, ‘parameter’ matters.

Note, I know this compiles (haven’t tried running it yet):

constants.F90:
module sorad_constants

#ifdef _CUDA
#define _CONSTANT constant
#else
#define _CONSTANT parameter
#endif

   implicit none
   real, dimension(5), _CONSTANT :: wk_uv = &
      [0.0, 0.0, 0.0, 0.0, 0.00075]
end module

kernel.F90:
! both CPU and GPU use it
use sorad_constants, only: wk_uv

driver.F90:
  ! nothing, we use the variable in the kernel

but there are still some #ifdef’s I was hoping to lose.
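For reference, here is the macro approach above collapsed into a single file, as a minimal sketch only. The file name sorad_demo.F90 and the small show_wk_uv program are hypothetical additions for illustration, and _CUDA is assumed to be defined by the compiler in CUDA Fortran builds, as in the listings above. The program is meant as a check of the CPU build, where the macro expands to parameter; whether host code may read the array this way in the GPU build is not verified here.

```fortran
! sorad_demo.F90 (hypothetical name). The capital .F90 suffix requests preprocessing.
! The only conditional block in the file selects the attribute for the array.
#ifdef _CUDA
#define _CONSTANT constant
#else
#define _CONSTANT parameter
#endif

module sorad_constants
   implicit none
   ! Expands to "constant" in the GPU build and "parameter" in the CPU build.
   real, dimension(5), _CONSTANT :: wk_uv = &
      [0.0, 0.0, 0.0, 0.0, 0.00075]
end module sorad_constants

! CPU-build check only: confirms the module compiles and the array is visible.
program show_wk_uv
   use sorad_constants, only: wk_uv
   implicit none
   print *, size(wk_uv), wk_uv(5)
end program show_wk_uv
```

If more arrays need the same treatment, the conditional block could move into one shared include file, so the individual modules carry no #ifdef of their own.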

Obviously a pure parameter might not get put into GPU constant memory, but if there isn’t a performance hit, I’ll take code cleanliness.

Am I violating some CUDA rule? As I said, in my main kernel module, both the CPU and GPU builds declare:

   integer, parameter :: nu = 43
   integer, parameter :: nw = 37

so I figured parameters are okay to “share”. Is it because wk_uv is an array? Because it’s in a module?

Hi Matt,

I asked Brent but we can’t think of anything better than using the ifdef approach. Though, we’ll ponder it a bit more and see what we can come up with.

Could you send me a reproducer for the pgf902 segv? That’s definitely a bug.

Thanks,
Mat

Reproducer sent!

Thanks Matt. I added TPR#22200 for the segv.

- Mat