Translating optional arguments to CUDA Fortran

This is yet another in a series of my questions arising from converting legacy code to GPUs. In this case, the code I am converting has many input arguments that are POINTERs and within the code, there are statements like:

IF( ASSOCIATED(FIXL) )  FIXL = QLW_AN + QLW_LS

These are used because arrays like FIXL are often diagnostic values that don’t have to be associated, and are only associated when the user asks for them to be output.

What I’m wondering is whether I can do something similar in CUDA Fortran. That is, let’s say I have FIXL_dev as an ALLOCATABLE, DEVICE array. Could I then use the ALLOCATED() intrinsic to do something similar to the above? In the driver, I would allocate FIXL_dev, the device copy of the FIXL host array/pointer, if and only if FIXL is ALLOCATED/ASSOCIATED, and then inside the device code I would use:

IF( ALLOCATED(FIXL_dev) )  FIXL_dev(i,j,k) = QLW_AN_dev(i,j,k) + QLW_LS_dev(i,j,k)

(and, on the host side, this would also let me skip the cudaMemcpy of the host array over to the GPU).

I’m not sure if this is allowed or not. I know device subprograms can’t have optional arguments, but can the global subroutine have “optionally allocated” arrays like this?

Or will I need to think of something clever like, if FIXL isn’t needed, fill FIXL_dev with trash (like 99999.9) and then use, say,

if( ALLTHREADS(FIXL_dev(i,j,k) /= 99999.9) ) FIXL_dev(i,j,k) = QLW_AN_dev(i,j,k) + QLW_LS_dev(i,j,k)

I’d have the transfer cost, but at least I could avoid the calculation.

Thanks,
Matt

Hi Matt,

Well, support for POINTER is still on the ‘coming features’ list so ASSOCIATED won’t work. Also ALLOCATED only works from the host, so that’s out as well.

I think the ALLTHREADS version would work but might be overkill. It seems to me that if you’ve gone to all the trouble of copying the FIXL_dev array over to the device, and given that ALLTHREADS can slow you down, you might as well just go ahead and do the computation.

Another option would be to use an integer in constant memory as your guard.

if( DO_FIXL ) FIXL_dev(i,j,k) = QLW_AN_dev(i,j,k) + QLW_LS_dev(i,j,k)
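
A minimal sketch of the constant-memory guard, in case it helps. Everything here is illustrative rather than taken from the original code: the module name fixl_guard_mod, the kernel update_fixl, the 1-D arrays, and the size n are all assumptions, and the flag is compared against zero because it is declared as an integer. When FIXL is not requested, the driver allocates a 1-element placeholder so the kernel always receives an allocated actual argument, and the kernel never touches it.

```fortran
module fixl_guard_mod
  use cudafor
  implicit none
  ! Hypothetical guard flag in constant memory: 1 = compute FIXL, 0 = skip it.
  integer, constant :: do_fixl
contains
  attributes(global) subroutine update_fixl(fixl_dev, qlw_an_dev, qlw_ls_dev, n)
    integer, value :: n
    real, device :: fixl_dev(*)                 ! assumed-size: may be a 1-element placeholder
    real, device :: qlw_an_dev(n), qlw_ls_dev(n)
    integer :: i

    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) then
      ! The guard is uniform across all threads, so there is no divergence.
      if (do_fixl /= 0) fixl_dev(i) = qlw_an_dev(i) + qlw_ls_dev(i)
    end if
  end subroutine update_fixl
end module fixl_guard_mod

program fixl_demo
  use cudafor
  use fixl_guard_mod
  implicit none
  integer, parameter :: n = 1024
  real, pointer :: fixl(:) => null()
  real :: qlw_an(n), qlw_ls(n)
  real, device, allocatable :: fixl_dev(:), qlw_an_dev(:), qlw_ls_dev(:)

  qlw_an = 1.0
  qlw_ls = 2.0
  allocate(fixl(n))   ! comment this out to mimic "diagnostic not requested"

  allocate(qlw_an_dev(n), qlw_ls_dev(n))
  qlw_an_dev = qlw_an
  qlw_ls_dev = qlw_ls

  if (associated(fixl)) then
    allocate(fixl_dev(n))
    do_fixl = 1       ! host assignment copies the flag into constant memory
  else
    allocate(fixl_dev(1))   ! placeholder only; never read or written on the device
    do_fixl = 0
  end if

  call update_fixl<<<(n + 255) / 256, 256>>>(fixl_dev, qlw_an_dev, qlw_ls_dev, n)

  if (associated(fixl)) then
    fixl = fixl_dev   ! copy back only when the diagnostic was requested
    print *, 'max FIXL = ', maxval(fixl)
  end if
end program fixl_demo
```

With this layout the host decides once, outside the kernel, and both the transfer of FIXL and the computation are skipped when the diagnostic is not wanted.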



> I know device subprograms can’t have optional arguments,

Actually, routines with the device attribute (not global) should support optional arguments so long as they are not passed by value.

- Mat

Reviving a bit of a dead thread: I was just wondering whether anything has changed on this front with CUDA 4. That is, can we now tell whether a host pointer is ASSOCIATED by using, say, UVA?

Or can we use ALLOCATED(temp_dev) to tell if a device variable has been allocated by the host?

I’m just looking for ways to make my combined CPU/GPU code a bit more readable.

Thanks,
Matt