Is there an equivalent of CUDA C's __launch_bounds__() in CUDA Fortran? If there is, please provide a usage example.
Not yet, though we do have an open feature request for it (TPR#19302). I’ll add you to the list of requesters.
Could you add me to that list as well? Thanks.
Basic support for launch bounds is available starting with the 2020 compilers. Here is the syntax:
attributes(global) launch_bounds(256,8) subroutine test(a,b,c,n)
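For anyone who wants to see that line in context, here is a minimal sketch of a complete program built around it. The kernel body (a simple vector add), the module name `kernels`, the array names, and the problem size are all made up for illustration; only the `attributes(global) launch_bounds(256,8)` line comes from the post above. I am assuming the two arguments mean the same as in CUDA C's `__launch_bounds__`: the maximum number of threads per block, then the desired minimum number of resident blocks per multiprocessor.

```fortran
module kernels
  use cudafor
  implicit none
contains
  ! Assumed meaning, by analogy with CUDA C's __launch_bounds__:
  ! at most 256 threads per block, and the compiler is asked to keep
  ! register use low enough for at least 8 resident blocks per SM.
  attributes(global) launch_bounds(256,8) subroutine test(a, b, c, n)
    integer, value :: n
    real :: a(n), b(n), c(n)   ! kernel dummy arrays live in device memory
    integer :: i
    ! CUDA Fortran thread and block indices are 1-based
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) c(i) = a(i) + b(i)
  end subroutine test
end module kernels

program main
  use cudafor
  use kernels
  implicit none
  integer, parameter :: n = 1024      ! illustrative problem size
  integer, parameter :: tpb = 256     ! must not exceed the launch bound
  real, device, allocatable :: a_d(:), b_d(:), c_d(:)
  real :: c(n)
  integer :: istat

  allocate(a_d(n), b_d(n), c_d(n))
  a_d = 1.0
  b_d = 2.0

  ! Launch with a block size that respects the 256-thread bound
  call test<<<(n + tpb - 1) / tpb, tpb>>>(a_d, b_d, c_d, n)
  istat = cudaDeviceSynchronize()

  c = c_d                              ! copy the result back to the host
  print *, 'max error:', maxval(abs(c - 3.0))   ! expected to be zero
  deallocate(a_d, b_d, c_d)
end program main
```

Launching the kernel with more than 256 threads per block would be expected to fail, as it does in CUDA C, so keep the block size in the chevrons at or below the first launch_bounds argument.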