In PTX I can have function calls, but the call graph needs to be finite and computable at compile time. Is this correct? And are the function calls inlined when PTX is converted to CUBIN instructions?
nvcc inlines the function calls when it compiles C to PTX, before the PTX assembly stage is even reached.
What happens if I specify the noinline option, or if I write PTX by hand? PTX does have call instructions as well as .func labels, doesn't it?
Yup, see the PTX ISA guide. PDF pg. 84 for .func and PDF pg. 70 for call.
Thanks for your response. I can see that I was not being very clear.
So let me rephrase the question.
I know that PTX has call instructions as well as .func labels. I also know that in PTX I can do “call fname”, where fname has to be a label and the call cannot be recursive. My question is this: does the call graph in PTX have to be finite? Is there, for example, a maximum “stack depth”, or can I have an arbitrary call graph? Or is there some other restriction? What about cycles in the call graph? Ruling out direct recursion only rules out an immediate cycle (a function calling itself), not a longer one such as a calling b and b calling a.
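The difference between direct recursion and a longer cycle in the call graph can be sketched in plain host-side C++. This is only an illustration of the graph property being asked about. The `CallGraph` type and the `hasDirectRecursion` and `hasCycle` helpers are hypothetical names, and nothing here describes what nvcc or ptxas actually check.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical call graph: caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// True if some function lists itself as a callee (an "immediate" cycle).
bool hasDirectRecursion(const CallGraph& g) {
    for (const auto& entry : g)
        for (const auto& callee : entry.second)
            if (callee == entry.first) return true;
    return false;
}

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search; reaching a function that is still InProgress means
// we followed a chain of calls back to one of its own callers.
static bool dfsFindsCycle(const CallGraph& g, const std::string& fn,
                          std::map<std::string, Mark>& marks) {
    marks[fn] = Mark::InProgress;
    auto it = g.find(fn);
    if (it != g.end()) {
        for (const auto& callee : it->second) {
            Mark m = marks[callee];  // missing entries default to Unvisited
            if (m == Mark::InProgress) return true;
            if (m == Mark::Unvisited && dfsFindsCycle(g, callee, marks))
                return true;
        }
    }
    marks[fn] = Mark::Done;
    return false;
}

// True if the call graph contains a cycle of any length.
bool hasCycle(const CallGraph& g) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : g)
        if (marks[entry.first] == Mark::Unvisited &&
            dfsFindsCycle(g, entry.first, marks))
            return true;
    return false;
}
```

For a graph in which a calls b, b calls c, and c calls a, `hasDirectRecursion` reports false while `hasCycle` reports true, which is exactly the case a "no direct recursion" rule would miss.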
I also know that the cubin format is not disclosed, but I am curious about what happens when PTX is converted into cubin. Do function calls get inlined?
The answer to the first part is yes, there is a maximum call depth, as the description of the call instruction says:
“In the current ptx release, parameters are passed through statically allocated ptx registers; i.e., there is no support for recursive calls.”
So the call depth is limited by register availability. As for the second part, I’m not sure if ptxas will perform additional transformations and inline functions that were not already inlined by nvcc. Maybe someone else knows this…
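Why statically allocated parameter storage rules out recursion can be shown with a toy model in plain host-side C++. This is a sketch under a loud assumption: the two `static` variables stand in for the single fixed set of registers a function would own, and all names (`fact_param`, `fact_local`, `factStaticSlots`, `callFactStaticSlots`, `factWithStack`) are hypothetical. It does not reproduce how ptxas allocates registers.

```cpp
// Toy model only: each function owns exactly one fixed slot per parameter
// and per local, instead of getting fresh storage for every activation.
static int fact_param;  // the single, statically allocated parameter slot
static int fact_local;  // the single, statically allocated local slot

// "Recursive" factorial that is only allowed to use the static slots.
// (The return value and return address still use the real C++ stack,
// which is a simplification of this model.)
int factStaticSlots() {
    fact_local = fact_param;
    if (fact_local <= 1) return 1;
    fact_param = fact_local - 1;
    int inner = factStaticSlots();  // nested call overwrites fact_local
    return fact_local * inner;      // fact_local no longer holds our n
}

// Caller-side wrapper: write the argument into the slot, then call.
int callFactStaticSlots(int n) {
    fact_param = n;
    return factStaticSlots();
}

// Reference version where every activation has its own copy of n.
int factWithStack(int n) {
    return n <= 1 ? 1 : n * factWithStack(n - 1);
}
```

With these definitions, `factWithStack(4)` yields 24, while `callFactStaticSlots(4)` yields 1, because every nested activation clobbers the one slot the outer activations still need. A non-recursive chain of distinct functions does not have this problem, since each function has its own slots; it just consumes more of them as the chain gets deeper.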
There are native call and return instructions, but ptxas and nvcc prefer inlining. In any case, you have to write your code as if everything will be inlined. There is no way to do recursion.