I tried to unroll some loops to see if I could improve the performance of two nested for-loops, and got these compiler messages:
warning, loop was not unrolled, inline assembly
warning, loop was not unrolled, not innermost loop
What does the first warning mean?
Is it not possible to unroll nested for-loops?
How can I unroll a loop if I don't know the number of iterations at compile time? Can I use some kind of template so that the right kernel is selected at run-time? Will I end up with a very large executable if I create something like 100 template instantiations?
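To make the last question concrete, here is a minimal sketch of the kind of template dispatch I have in mind. It is plain C++ host code rather than an actual kernel, and the names sum_fixed, sum_generic and sum_dispatch are made up for illustration. I assume the same pattern would carry over to a templated kernel, but I have not verified that.

```cpp
// Hypothetical example: the trip count N is a template parameter, so it is
// known at compile time and the compiler is free to unroll the loop fully.
template <int N>
float sum_fixed(const float* data) {
    float acc = 0.0f;
    for (int i = 0; i < N; ++i) {
        acc += data[i];
    }
    return acc;
}

// Fallback for trip counts that have no dedicated instantiation.
// Here the count is only known at run-time, so full unrolling is not possible.
float sum_generic(const float* data, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += data[i];
    }
    return acc;
}

// Run-time selection: map the run-time value n onto one of the
// compile-time instantiations, or fall back to the generic loop.
float sum_dispatch(const float* data, int n) {
    switch (n) {
        case 1: return sum_fixed<1>(data);
        case 2: return sum_fixed<2>(data);
        case 4: return sum_fixed<4>(data);
        case 8: return sum_fixed<8>(data);
        default: return sum_generic(data, n);
    }
}
```

Each case in the switch instantiates a separate copy of the function, which is why I am worried about the executable size if I go up to 100 cases.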