Shared memory array pointers in CUDA

Since in CUDA I can only have one shared memory array per kernel, I was wondering if there was a way to have pointers to different parts of the array?

I’ve been trying to set up a large array from which I can have smaller sub-arrays, as described here: http://stackoverflow.com/questions/5369799/can-i-create-sub-arrays-in-fortran-using-pointers. I was wondering if I could do the same for shared memory arrays?

I have a shared memory array called ‘alldata’ which has 9 elements, and I want to create 3 smaller sub-arrays: ‘left’ consists of the first 3 elements of ‘alldata’, ‘centre’ consists of the next 3 elements and ‘right’ consists of the last 3 elements. The kernel is launched with one block of 3 threads. This is the code I’ve written:

module test
  implicit none
  
  real, device, dimension(3) :: d_left
  real, device, dimension(3) :: d_centre
  real, device, dimension(3) :: d_right

  real, dimension(3)         :: h_left
  real, dimension(3)         :: h_centre
  real, dimension(3)         :: h_right
  
contains

  attributes(global) subroutine shared_sub_arrays()

    integer :: i

    real, shared, dimension(*), target :: alldata
    real, shared, dimension(:), pointer :: left
    real, shared, dimension(:), pointer :: centre
    real, shared, dimension(:), pointer :: right

    i = threadIdx%x
    
    left   => alldata(1:3)
    centre => alldata(4:6)
    right  => alldata(7:9)    

    left(i) = 1.0
    centre(i) = 2.0
    right(i) = 3.0
        
    d_left(i)   = left(i)
    d_centre(i) = centre(i)
    d_right(i)  = right(i)
    
  end subroutine shared_sub_arrays

end module test
  
program shared_test
  use test
  implicit none

  ! Populate arrays
  call shared_sub_arrays<<<1,3,9*sizeof(real(1.0))>>>()

  ! Copy arrays from device to host
  h_left   = d_left
  h_centre = d_centre
  h_right  = d_right

  ! Print arrays
  print *, h_left
  print *, h_centre
  print *, h_right
  
end program shared_test

Unfortunately it won’t compile, as the target and pointer attributes conflict with the shared attribute:


phrkaj@larch:~/Projects/cuda/cuda-test> pgf90 -Mcuda shared_arrays.f90
PGF90-S-0134-Illegal attribute - conflict with target (shared_arrays.f90: 18)
PGF90-S-0134-Illegal attribute - conflict with pointer (shared_arrays.f90: 19)
PGF90-S-0134-Illegal attribute - conflict with pointer (shared_arrays.f90: 20)
PGF90-S-0134-Illegal attribute - conflict with pointer (shared_arrays.f90: 21)
  0 inform,   0 warnings,   4 severes, 0 fatal for shared_sub_arrays

Is there any way round this? I could use just one array with offset indices to represent the sub-arrays, but that would make things a lot more complicated, so I was hoping I could do it this way.

Thanks in advance for the help.

Hi Tom,

> Since in CUDA I can only have one shared memory array per kernel, I was wondering if there was a way to have pointers to different parts of the array?

Provided they are fixed size, you can have multiple shared arrays. The restriction is that you can have only one shared array whose size is set dynamically when the kernel is launched. I rewrote your example below to use fixed-size shared arrays.

Hope this helps,
Mat



% cat test.cuf
module test
  implicit none
 
  real, device, dimension(3) :: d_left
  real, device, dimension(3) :: d_centre
  real, device, dimension(3) :: d_right

  real, dimension(3)         :: h_left
  real, dimension(3)         :: h_centre
  real, dimension(3)         :: h_right
 
contains

  attributes(global) subroutine shared_sub_arrays()

    integer :: i

    real, shared, dimension(3) :: left
    real, shared, dimension(3) :: centre
    real, shared, dimension(3) :: right

    i = threadIdx%x
   
    left(i) = 1.0
    centre(i) = 2.0
    right(i) = 3.0
       
    d_left(i)   = left(i)
    d_centre(i) = centre(i)
    d_right(i)  = right(i)
   
  end subroutine shared_sub_arrays

end module test
 
program shared_test
  use test
  implicit none

  ! Populate arrays
  call shared_sub_arrays<<<1,3>>>()

  ! Copy arrays from device to host
  h_left   = d_left
  h_centre = d_centre
  h_right  = d_right

  ! Print arrays
  print *, h_left
  print *, h_centre
  print *, h_right
 
end program shared_test 
% pgf90 test.cuf ; a.out
    1.000000        1.000000        1.000000    
    2.000000        2.000000        2.000000    
    3.000000        3.000000        3.000000    
%
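
If the sub-array size does have to be chosen at launch time, the offset-index approach mentioned in the question is one way to stay with the single dynamically sized shared array. The following is only a rough sketch and has not been compiled here. The names test_offsets, shared_offsets, left_off, centre_off and right_off are made up for illustration, and the byte count in the launch assumes a default real occupies 4 bytes. Note that the third launch parameter has to cover all 9 elements, not just 3.

```fortran
module test_offsets
  implicit none

  ! Length of each logical sub-array (hypothetical name)
  integer, parameter :: n = 3

  real, device, dimension(n) :: d_left
  real, device, dimension(n) :: d_centre
  real, device, dimension(n) :: d_right

contains

  attributes(global) subroutine shared_offsets()

    integer :: i

    ! The one dynamically sized shared array; its size in bytes is
    ! given by the third kernel launch parameter.
    real, shared :: alldata(*)

    ! Offsets marking where each logical sub-array starts in alldata
    integer, parameter :: left_off   = 0
    integer, parameter :: centre_off = n
    integer, parameter :: right_off  = 2*n

    i = threadIdx%x

    ! Each thread writes only its own element of each sub-array
    alldata(left_off   + i) = 1.0
    alldata(centre_off + i) = 2.0
    alldata(right_off  + i) = 3.0

    ! Each thread reads back only what it wrote, so no
    ! syncthreads() is needed in this particular sketch
    d_left(i)   = alldata(left_off   + i)
    d_centre(i) = alldata(centre_off + i)
    d_right(i)  = alldata(right_off  + i)

  end subroutine shared_offsets

end module test_offsets

program shared_offsets_test
  use test_offsets
  implicit none

  real, dimension(n) :: h_left, h_centre, h_right

  ! Request room for 3 sub-arrays of n reals each,
  ! ASSUMING 4 bytes per default real
  call shared_offsets<<<1, n, 3*n*4>>>()

  ! Copy arrays from device to host
  h_left   = d_left
  h_centre = d_centre
  h_right  = d_right

  ! Print arrays
  print *, h_left
  print *, h_centre
  print *, h_right

end program shared_offsets_test
```

Keeping the offsets as named constants means the indexing stays readable, although it is still more bookkeeping than the fixed-size arrays above. If threads ever read elements written by other threads, a call to syncthreads() between the writes and the reads would be needed.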