Function call mechanism in CUDA

I have been looking at the CUDA Binary Utilities section of the online documentation because I wanted to know more about the actual ISA (out of curiosity).
I noticed that there are several instructions for handling function calls, such as JCAL, RET, EXIT, PRET…
Where can I find more detailed information on these instructions?
In particular, I am interested in the function call mechanism and how function frames are handled. Is there a “stack” like on x86 CPUs? Where are the return addresses stored?

You will find some technical details about the memory layout of the stack and local memory here: