Can evaluation of common parameter during kernel execution have a significant detrimental effect?

A kernel I’ve been given to optimise contains many if-then-else statements that evaluate a common parameter: its value is always the same for all threads, but it can take up to N different values, so all threads follow the same one of N possible paths. Consequently there is no warp divergence, because the parameter has the same value for every thread, but the computation each thread performs depends on that value. Can the time taken to evaluate that parameter be significant, particularly when there are many if-then-else statements testing it?
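The structure described above might look something like this (a plain C++ sketch with hypothetical names; in the real kernel these branches would sit inside a `__global__` function):

```cpp
// 'mode' is the common parameter: the same value for every thread,
// but only known at runtime, so each thread walks the if-else chain.
int process(int mode, int x) {
    if (mode == 0) {
        return x + 1;    // algorithm for mode 0
    } else if (mode == 1) {
        return x * 2;    // algorithm for mode 1
    } else {
        return x - 1;    // fallback algorithm
    }
}
```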

I assume that all threads evaluate this parameter at runtime before the warp decides which path to take. How long does this evaluation take?

These runtime evaluations need not take place: the choice of algorithm for each value of the parameter could instead be made at compile time, using #if-#else-#endif directives to insert only the relevant code.
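A compile-time version of that selection could be sketched as below (plain C with a hypothetical `ALGO` macro, which would be set on the compiler command line, e.g. `-DALGO=1`, so only one branch’s code is ever compiled in):

```c
#ifndef ALGO
#define ALGO 0   /* default if not set on the compiler command line */
#endif

int process(int x) {
#if ALGO == 0
    return x + 1;   /* only this body exists in the binary when ALGO == 0 */
#elif ALGO == 1
    return x * 2;
#else
    return x - 1;
#endif
}
```

The trade-off is that changing the parameter now requires recompiling, so this only fits when the value is fixed per build.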

I am a big proponent of using real-world experiments rather than gedankenexperiments to determine the impact of particular code designs. The surest way to find out is to implement the code, measure / profile, then try various optimization strategies. It is impossible to say how much impact the (repeated?) checking of the parameter at runtime will have. Depending on context, the impact could be in the measurement noise, or it could be significant.

If N is fairly small, I would suggest templatizing the kernel, with the parameter in question as a template argument.
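That templated pattern might be sketched as follows (plain C++ with hypothetical names; in CUDA the template function would be a `__global__` kernel and the host-side switch would select which instantiation to launch):

```cpp
// Mode is now a compile-time constant: the compiler generates one
// specialized version per value, with the branch folded away.
template <int Mode>
int process(int x) {
    if constexpr (Mode == 0)      return x + 1;
    else if constexpr (Mode == 1) return x * 2;
    else                          return x - 1;
}

// One runtime switch (on the host, in the CUDA case) picks the right
// instantiation, instead of every thread re-testing the parameter.
int dispatch(int mode, int x) {
    switch (mode) {
        case 0:  return process<0>(x);
        case 1:  return process<1>(x);
        default: return process<2>(x);
    }
}
```

The cost is one kernel instantiation per value of the parameter, which is why this approach suits a fairly small N.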