Recently, when I turn on the -Minfo flag, I see a lot of ‘copyin’ and ‘copyout’ messages for function calls, for example:
setup_ryr_2:
2825, Possible copy in and copy out of akrm in call to getcompk_4
2826, Possible copy in and copy out of akrp in call to getcompk_4
put_sfu2grid3d_nonuniform_rogue_ryr:
2246, Copy in and copy out of sfu_coords in call to valid_sfu_location
2247, Copy in and copy out of sfu_coords in call to map_loc2grid
2283, Copy in and copy out of sfu_coords in call to valid_sfu_location
2284, Copy in and copy out of sfu_coords in call to map_loc2grid
I’m not using the Accelerator programming model, but CUDA Fortran. So my question is: how can I optimize code written in CUDA Fortran by telling the compiler when to do a copyin and/or copyout?
Can it be done using clauses, as in the Fortran Accelerator model? If so, please give me an example.
Thanks,
Tuan