How is the intrinsic assignment (=) between host and device arrays implemented in PGI CUDA Fortran when the host-side and device-side arrays have different precision?
For example, given
REAL(8) :: A(:)
REAL(4), DEVICE :: A_d(:)
A_d = A
is A first converted implicitly into a pinned host buffer of 4-byte reals, and is that buffer then copied to the GPU?