I am trying to understand why, in CUDA Fortran, a derived type containing only device allocatable arrays needs the managed attribute rather than allowing the device attribute. A copy of sample test code follows; the derived type in question is extracted here:
type :: cs1
   real, device, allocatable, dimension(:,:) :: s
   real, device, allocatable, dimension(:)   :: cfl
end type cs1

type(cs1), managed, allocatable, dimension(:) :: edge_stream  ! This works ok
!!type(cs1), device, allocatable, dimension(:) :: edge_stream ! This causes a core dump
Thank you.
cuda_stream_module_f90.txt (5.1 KB)
This part:
allocate(edge_stream(n)%s(m1,m2))
allocate(edge_stream(n)%cfl(m2))
If “edge_stream” is a device array, it cannot be accessed from the host. Hence when you try to allocate the data members, the host segfaults, because “edge_stream” itself must be dereferenced to reach them.
“Managed” memory is accessible from both the host and the device, which is why it works.
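A minimal sketch of the working pattern (assuming nvfortran with the cudafor module; the module name, sizes, and loop here are illustrative, not taken from the attached file):

```fortran
module cs_mod
   use cudafor
   implicit none
   type :: cs1
      real, device, allocatable, dimension(:,:) :: s
      real, device, allocatable, dimension(:)   :: cfl
   end type cs1
end module cs_mod

program alloc_demo
   use cs_mod
   implicit none
   type(cs1), managed, allocatable, dimension(:) :: edge_stream
   integer :: n

   allocate(edge_stream(4))   ! parent array lives in managed memory
   do n = 1, 4
      ! Legal: dereferencing edge_stream(n) on the host works because the
      ! parent is managed; the members themselves are device allocations.
      allocate(edge_stream(n)%s(16,16))
      allocate(edge_stream(n)%cfl(16))
   end do
end program alloc_demo
```

With the commented-out device declaration of edge_stream instead, the same allocate statements would dereference a device address on the host and segfault.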
Is there a way I can allocate the array elements within the device-attribute derived type without resorting to managed memory?
I tried writing an attributes(global) alloc_gpu() kernel with just one thread to allocate the array elements, but the compiler would not permit the procedure.
I noticed (using the example above) that if I give edge_stream the “managed” attribute but the allocatable array elements (s, cfl) the “device” attribute, it compiles and runs successfully.
Moreover, I also seem to be able to use a local pointer (e_s) with device routines that points to the managed array edge_stream. Am I doing this memory management efficiently?
Correct, only the parent type needs to be managed. The allocatable array members can have the device attribute.
Again, the problem is that device data allocation can only be initiated from the host. Hence if the parent object had the “device” attribute, then when you go to allocate the members, the allocation segfaults because the parent’s address is dereferenced on the host.
Moreover, I also seem to be able to use a local pointer (e_s) with device routines that points to the managed array edge_stream.
The pointer assignment will be to the managed memory address, but that can be used as a device address, so it should be fine.
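A hedged sketch of what that pointer pattern might look like (all names here are illustrative, not from the attached file; note that for the pointer assignment to be valid, edge_stream would also need the target attribute):

```fortran
! Host side: a local pointer aliasing the managed parent array
type(cs1), managed, allocatable, target, dimension(:) :: edge_stream
type(cs1), pointer, dimension(:) :: e_s

e_s => edge_stream                 ! e_s now holds a managed address
call touch_cfl<<<1, 16>>>(e_s, 1)  ! managed address is usable as a device address

! Device side: a hypothetical kernel that writes through the pointer target
attributes(global) subroutine touch_cfl(e_s, n)
   use cs_mod
   type(cs1), device :: e_s(*)
   integer, value :: n
   e_s(n)%cfl(threadIdx%x) = 0.0
end subroutine touch_cfl
```

The pointer itself lives on the host; only the address it carries, which points into managed memory, is handed to the kernel.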
Am I doing this memory management efficiently?
I don’t have enough information to say, but managed memory is more about ease of use. As with all data management, the most efficient approach is to copy the data to the device at the beginning, perform all computation on that data on the device, and then bring the results back at the end. If you need the data back on the host during the run, that’s fine, but it adds data-movement cost.
The key difference with managed memory is that the CUDA driver does the data movement implicitly, and only when the data is “dirty”. So if you don’t touch the data on the host, it doesn’t get copied back.
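For contrast, the explicit-movement pattern described above looks roughly like this (a sketch; the array names and the kernel are placeholders):

```fortran
real, allocatable         :: a(:)    ! host copy
real, device, allocatable :: a_d(:)  ! device copy

allocate(a(n), a_d(n))
a   = 1.0
a_d = a                                   ! host -> device, once at the start
call compute<<<blocks, threads>>>(a_d, n) ! all work stays on the device
a   = a_d                                 ! device -> host, once at the end
```

Here the programmer pays the two transfers explicitly and exactly once, whereas with managed memory the driver migrates pages on demand whenever host or device touches data the other side last modified.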