CUDA Fortran + float3/float4

sWienke · April 6, 2011, 12:10pm

Hello,
using NVIDIA’s CUDA C, there are built-in vector data types such as float3 and float4 (which, as far as I know, promise good memory access patterns and alignment).

Does CUDA Fortran have analogous derived types?
If I do it manually (see below), then I don’t know how to ensure correct alignment…

type :: float4
  sequence
  real*4:: x,y,z,w
end type float4

Cheers, Sandra

MatColgrove · April 7, 2011, 5:11pm

Hi Sandra,

No, CUDA Fortran does not support these vector types. However, since Fortran allows you to perform operations on whole arrays, I’m wondering if they are necessary. Wouldn’t declaring a 3- or 4-element array work?
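
As a minimal sketch of the array approach (standard host-side Fortran; the program and variable names are hypothetical, and it makes no claim about how the array is aligned on the device):

```fortran
program vec4_array
  implicit none
  ! A plain 4-element array stands in for CUDA C's float4 (names are illustrative).
  real(kind=4) :: a(4), b(4), c(4)

  a = [1.0, 2.0, 3.0, 4.0]
  b = 0.5                 ! scalar broadcast to all four elements
  c = a + 2.0 * b         ! whole-array operation, no explicit loop
  print *, c
end program vec4_array
```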

Mat

MatColgrove · April 11, 2011, 8:35pm

Hi Sandra,

Here’s the response from Michael:

The vector data types (float3, float4) are important when programming
with OpenCL for the ATI, but aren’t needed for good performance on
NVIDIA. They are used in CUDA mostly for texture and surface references.

We don’t have an analog to these vector data types in CUDA Fortran.

Mat

sWienke · April 13, 2011, 6:39am

Hi,
Thanks for the information. Two more comments from my side. We used the float4 type with CUDA C and got better performance on several NVIDIA GPUs (although not on Fermi, I believe). I think float4 types are aligned, which could give better performance in some cases. If I use Fortran’s array operations instead, it should be almost the same, but what about alignment in Fortran?
So I think there could be differences.
Bye, Sandra

tty103 · April 13, 2011, 8:10pm

I would guess nvcc could put all four elements contiguously in one segment if four threads access the four elements. If one thread accesses all four elements, nvcc could put the four elements at the same location but in four memory segments. In Fortran, the user has to do the memory management.
Say there are 10 points: we could use a my_pnt(1:10)%xyzw(1:4) style
or an x(1:10), y(1:10), z(1:10) and w(1:10) style, depending on how you plan to access them.
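
A rough sketch of the two layouts in standard host-side Fortran, assuming a hypothetical derived type point4 (not a CUDA Fortran built-in); which layout performs better on the GPU depends on the access pattern and is not shown here:

```fortran
program layout_demo
  implicit none
  integer, parameter :: n = 10

  ! Style 1: one derived type per point; the four components of a point
  ! are adjacent in memory. "point4" is a hypothetical name.
  type :: point4
    sequence
    real(kind=4) :: xyzw(4)
  end type point4

  type(point4) :: my_pnt(n)

  ! Style 2: one array per component; all x values are adjacent,
  ! then all y values, and so on.
  real(kind=4) :: x(n), y(n), z(n), w(n)
  integer :: i

  do i = 1, n
    ! Style 1: fill the components of point i
    my_pnt(i)%xyzw(1)   = real(i, kind=4)
    my_pnt(i)%xyzw(2:4) = 0.0

    ! Style 2: fill element i of each component array
    x(i) = real(i, kind=4)
    y(i) = 0.0
    z(i) = 0.0
    w(i) = 0.0
  end do

  ! Same logical value, reached through the two layouts
  print *, my_pnt(3)%xyzw(1), x(3)
end program layout_demo
```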
