Is there support for launch_bounds in CUDA Fortran? Any plans to add it?
Our engineers took a look and don’t think it would be difficult to add, so I’ve filed a feature request (TPR#19161). There’s no ETA yet, but hopefully it won’t take too long.