This is yet another in a series of my questions arising from converting legacy code to GPUs. In this case, the code I am converting has many input arguments that are POINTERs and within the code, there are statements like:
IF( ASSOCIATED(FIXL) ) FIXL = QLW_AN + QLW_LS
These checks exist because arrays like FIXL are often diagnostic values that don't have to be associated; they are associated only when the user asks for them to be output.
What I'm wondering is whether I can do something similar in CUDA Fortran. That is, let's say I have FIXL_dev as an ALLOCATABLE, DEVICE array. Could I then use the ALLOCATED() intrinsic to do something similar to the above? In the driver, I would allocate FIXL_dev, the device copy of the FIXL host array/pointer, if and only if FIXL is ALLOCATED/ASSOCIATED, and then inside the device code I would use:
IF( ALLOCATED(FIXL_dev) ) FIXL_dev(i,j,k) = QLW_AN_dev(i,j,k) + QLW_LS_dev(i,j,k)
(This would also let me skip the cudaMemcpy of the host array over to the GPU when it isn't needed.)
I’m not sure if this is allowed or not. I know device subprograms can’t have optional arguments, but can the global subroutine have “optionally allocated” arrays like this?
Or will I need to think of something clever like, if FIXL isn’t needed, fill FIXL_dev with trash (like 99999.9) and then use, say,
if( ALLTHREADS(FIXL_dev(i,j,k) /= 99999.9) ) FIXL_dev(i,j,k) = QLW_AN_dev(i,j,k) + QLW_LS_dev(i,j,k)
I’d have the transfer cost, but at least I could avoid the calculation.
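For concreteness, here is an untested sketch of another fallback I could imagine, in case ALLOCATED() turns out not to be usable in device code: the host evaluates ASSOCIATED(FIXL) once and passes the result to the kernel as a LOGICAL, VALUE argument. The names diag_kernels, update_diag, and want_fixl, and the array sizes, are made up for illustration. It also assumes that assumed-shape dummies are accepted in a global subroutine and that a 1x1x1 placeholder allocation is an acceptable way to keep the actual argument allocated when the diagnostic is off.

```fortran
module diag_kernels
  use cudafor
  implicit none
contains
  ! want_fixl is a hypothetical by-value flag standing in for the
  ! ASSOCIATED()/ALLOCATED() test, which is done on the host instead.
  attributes(global) subroutine update_diag(fixl, qlw_an, qlw_ls, nx, ny, nz, want_fixl)
    integer, value :: nx, ny, nz
    logical, value :: want_fixl
    real, device :: fixl(:,:,:)          ! may be a 1x1x1 placeholder
    real, device :: qlw_an(nx,ny,nz), qlw_ls(nx,ny,nz)
    integer :: i, j, k
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    j = (blockIdx%y - 1) * blockDim%y + threadIdx%y
    k = (blockIdx%z - 1) * blockDim%z + threadIdx%z
    if (i > nx .or. j > ny .or. k > nz) return
    ! fixl is only touched when the diagnostic was requested
    if (want_fixl) fixl(i,j,k) = qlw_an(i,j,k) + qlw_ls(i,j,k)
  end subroutine update_diag
end module diag_kernels

program demo
  use cudafor
  use diag_kernels
  implicit none
  integer, parameter :: nx = 8, ny = 8, nz = 4
  real, pointer :: FIXL(:,:,:) => null()
  real :: QLW_AN(nx,ny,nz), QLW_LS(nx,ny,nz)
  real, device, allocatable :: FIXL_dev(:,:,:), QLW_AN_dev(:,:,:), QLW_LS_dev(:,:,:)
  logical :: want_fixl
  type(dim3) :: grid, tblock

  allocate(FIXL(nx,ny,nz))   ! comment out to mimic "diagnostic not requested"
  QLW_AN = 1.0
  QLW_LS = 2.0
  want_fixl = associated(FIXL)

  allocate(QLW_AN_dev(nx,ny,nz), QLW_LS_dev(nx,ny,nz))
  QLW_AN_dev = QLW_AN        ! host-to-device copies by assignment
  QLW_LS_dev = QLW_LS
  if (want_fixl) then
    allocate(FIXL_dev(nx,ny,nz))
  else
    allocate(FIXL_dev(1,1,1)) ! placeholder; never read or written by the kernel
  end if

  tblock = dim3(8, 8, 1)
  grid = dim3((nx + 7) / 8, (ny + 7) / 8, nz)
  call update_diag<<<grid, tblock>>>(FIXL_dev, QLW_AN_dev, QLW_LS_dev, nx, ny, nz, want_fixl)

  if (want_fixl) then
    FIXL = FIXL_dev          ! device-to-host copy only when requested
    print *, 'FIXL(1,1,1) =', FIXL(1,1,1)
  end if
end program demo
```

With something like this there would be no sentinel fill and no transfer for an unrequested diagnostic, at the cost of one extra flag argument per optional array. I'd still prefer the ALLOCATED() form if it is legal, since it keeps the device code closer to the original.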