Is there a way to unroll a simple do loop inside a CUDA Fortran kernel?
I tried, amongst others,
!pgi$l unroll = n:2
plus
-Munroll
as a compiler switch, but I can’t convince the compiler to unroll the loop.
Am I using the directive incorrectly, or is this feature simply not available? If it isn't available, why not?