shared memory - double precision C1060

Hi,
CUDA C provides a way to avoid shared memory bank conflicts with double-precision data by storing the low and high 32-bit words of each value in separate arrays:

__shared__ int shared_low[32];
__shared__ int shared_hi[32];

using the following functions

__double2loint()
__double2hiint()
__hiloint2double()
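
For readers unfamiliar with the CUDA C technique, here is a minimal sketch of how those three intrinsics are typically combined. The kernel name `rotate_via_split`, the host helper `split_roundtrip_ok`, and the test data are made up for illustration; only the intrinsics and the two shared arrays come from the post. Whether the split actually removes conflicts depends on the hardware's bank width, so treat it as an illustration rather than a tuned implementation.

```cuda
#include <cuda_runtime.h>

#define N 32  // matches the 32-element shared arrays in the post

// Each thread stores its double as two 32-bit halves in separate shared
// arrays, then reads a neighbour's halves and reassembles the double.
// Requires a GPU with double-precision support.
__global__ void rotate_via_split(const double *in, double *out)
{
    __shared__ int shared_low[N];
    __shared__ int shared_hi[N];

    int tid = threadIdx.x;
    double v = in[tid];

    // Consecutive threads touch consecutive 32-bit words in each array.
    shared_low[tid] = __double2loint(v);
    shared_hi[tid]  = __double2hiint(v);
    __syncthreads();

    // Note the argument order: high word first, then low word.
    int src = (tid + 1) % N;
    out[tid] = __hiloint2double(shared_hi[src], shared_low[src]);
}

// Hypothetical host helper: returns true if every value survives the
// split/reassemble round trip bit-for-bit.
bool split_roundtrip_ok()
{
    double h_in[N], h_out[N];
    for (int i = 0; i < N; ++i) h_in[i] = 1.0 / (i + 3);

    double *d_in = 0, *d_out = 0;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    rotate_via_split<<<1, N>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < N; ++i)
        if (h_out[i] != h_in[(i + 1) % N]) return false;
    return true;
}
```

To try it, call `split_roundtrip_ok()` from a `main` function and compile the file with `nvcc`.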

Is there a better way in CUDA Fortran, and/or are these functions implemented in CUDA Fortran?
If possible, could someone provide some sample code?

Thanks
Tuan

Hi Tuan,

CUDA Fortran doesn’t support these functions, but a simple way to avoid bank conflicts is to pad your shared arrays. In other words, use an array dimension of 33 instead of 32, or 17 instead of 16.
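
The padding idea can be sketched as follows, written in CUDA C to match the snippet in the question. The kernel name `transpose_padded`, the helper `padded_transpose_ok`, and the 16x16 tile are assumptions chosen for illustration, not code from this thread. In CUDA Fortran the equivalent would presumably be to pad the leading dimension of the shared array, since Fortran arrays are column-major. How many conflicts remain for 8-byte elements depends on the hardware's bank width, so profile before relying on it.

```cuda
#include <cuda_runtime.h>

#define TILE 16  // tile edge; 16 is the half-warp size on older hardware
#define PAD  1   // the extra column: 17 instead of 16

// Transposes one TILE x TILE block through shared memory. Without PAD,
// the column-wise read below walks the tile with a stride of exactly
// TILE elements, so many threads land on the same bank. The extra
// column shifts each row so that those accesses are spread out.
__global__ void transpose_padded(const double *in, double *out)
{
    __shared__ double tile[TILE][TILE + PAD];

    int x = threadIdx.x;
    int y = threadIdx.y;

    tile[y][x] = in[y * TILE + x];   // row-wise write
    __syncthreads();
    out[y * TILE + x] = tile[x][y];  // column-wise read, stride TILE + PAD
}

// Hypothetical host helper: returns true if the output is the exact
// transpose of the input.
bool padded_transpose_ok()
{
    const int count = TILE * TILE;
    double h_in[count], h_out[count];
    for (int i = 0; i < count; ++i) h_in[i] = 0.5 * i;

    double *d_in = 0, *d_out = 0;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    transpose_padded<<<1, dim3(TILE, TILE)>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int y = 0; y < TILE; ++y)
        for (int x = 0; x < TILE; ++x)
            if (h_out[y * TILE + x] != h_in[x * TILE + y]) return false;
    return true;
}
```

Setting `PAD` to 0 gives the unpadded version with identical results, which makes it easy to compare the two in a profiler.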

- Mat