The manual says that a “device function is always inlined”, and I take that to mean the function body will replace the function call and its variables will be expanded in place. In my program the main section is a loop in which several functions are called. This configuration requires 99 registers. Copying the body of each function into its own block in place of the original function call (manual inlining) resulted in 26 fewer registers being used (still not enough to improve the situation, but certainly an improvement). The signatures of my functions contain “const type”, “const type &”, and “type &”, where type is int, float, or one structure.
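To make the comparison concrete, here is a minimal sketch of the two variants. It is not my actual code: `Particle`, `scale`, `accumulate`, `with_calls`, `manual_inline`, and the two kernels are hypothetical names, and the bodies are placeholders that only mirror the kinds of signatures described above. The `HD` macro is also just an assumption of this sketch, so that the same file builds with a plain C++ compiler for host-side checking.

```cpp
// Sketch only: all names here are hypothetical stand-ins for the real code.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD  // plain C++ build: no CUDA qualifiers
#endif

struct Particle { float x; float v; };

// Signatures of the same kinds as in the question:
// "const type", "const type &", and "type &".
HD inline float scale(const float a, const int n) { return a * n; }
HD inline void accumulate(const Particle &p, float &sum) { sum += p.x * p.v; }

// Variant A: the loop calls the functions and relies on the compiler to inline them.
HD inline float with_calls(const Particle *ps, const int count)
{
    float sum = 0.0f;
    for (int i = 0; i < count; ++i) {
        accumulate(ps[i], sum);
        sum = scale(sum, 2);
    }
    return sum;
}

// Variant B: each function body is pasted into its own block (manual inlining).
HD inline float manual_inline(const Particle *ps, const int count)
{
    float sum = 0.0f;
    for (int i = 0; i < count; ++i) {
        { sum += ps[i].x * ps[i].v; }  // body of accumulate
        { sum = sum * 2; }             // body of scale with n == 2
    }
    return sum;
}

#ifdef __CUDACC__
// One kernel per variant, so the register count of each can be inspected separately.
__global__ void kernel_calls(const Particle *ps, const int count, float *out)
{
    *out = with_calls(ps, count);
}

__global__ void kernel_manual(const Particle *ps, const int count, float *out)
{
    *out = manual_inline(ps, count);
}
#endif
```

Compiling with `nvcc --ptxas-options=-v` prints the registers used by each kernel, so the two variants can be compared directly. A toy example this small may well show no difference at all; it is only meant to show the shape of the comparison.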
Is automatic inlining not the same as manual inlining?
Is there anything about how I am passing arguments that might cause additional registers to be consumed?