Meaning of nvlink warning: Stack size for entry function cannot be statically determined

Hello,
I am currently developing a fairly simple CUDA code. It consists of a kernel call with input parameters; inside the kernel there are some declarations and allocations, followed by two nested while loops. Inside these loops a device function is called, which also has its own declarations and allocations and contains one more while loop (minimal code below).

The problem is that during compilation nvcc gives me this warning:

nvlink warning : Stack size for entry function ‘_Z11BlockKernelPK10cuLoopDataPK10cuBiphotonP9cuCrystalPK8cuPhotonS9_S9_P10cuMatrix3dIdESC_iidPK8cuMatrixIdE’ cannot be statically determined.

Can someone please explain what this warning means and how I can get rid of it?

Thanks a lot,

Dalibor

Kernel (inptPar1, ..., inptParN){
  ... declarations, allocations, initializations
  while(cond1){
    while(cond2){
      threadFunction(some parameters from kernel parameters list and additional parameters declared, allocated and initialized in Kernel);
    }
  }
}

threadFunction(some parameters from kernel parameters list and parameters declared, allocated and initialized in Kernel){
  ... declarations, allocations, initializations
  while(cond3){
    code with some if statements, allocations and freeing of device memory
  }
}

I asked the linker team for insight. That message typically means that there is recursion in your code. In the absence of recursion, the compiler can normally determine stack usage at compile time and this data can then be communicated to the driver so it can size the stack accordingly.

When there is recursion, the compiler cannot accurately determine the required stack size, because the depth of the recursion is unknown at compile time, and your kernel may experience a stack overflow at runtime. In that case you would have to manually increase the stack size through the appropriate CUDA API.
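For illustration, here is a minimal sketch of adjusting the per-thread stack size with the CUDA runtime API calls cudaDeviceGetLimit and cudaDeviceSetLimit using cudaLimitStackSize. The function sumTo, the kernel recursiveKernel, and the 16 KB value are hypothetical and chosen only for the example; the size you actually need depends on your recursion depth and frame size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical recursive function: the call depth depends on a runtime
// value, so the stack needed by the calling kernel is not known statically.
__host__ __device__ int sumTo(int n) {
    return (n <= 0) ? 0 : n + sumTo(n - 1);
}

// Hypothetical kernel that calls the recursive function.
__global__ void recursiveKernel(int n, int *out) {
    *out = sumTo(n);
}

int main() {
    // Query the current per-thread stack size limit.
    size_t stackBytes = 0;
    cudaDeviceGetLimit(&stackBytes, cudaLimitStackSize);
    printf("current per-thread stack: %zu bytes\n", stackBytes);

    // Raise the limit before launching the kernel (16 KB is an assumed value).
    cudaError_t err = cudaDeviceSetLimit(cudaLimitStackSize, 16 * 1024);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetLimit failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int *dOut = nullptr;
    int hOut = 0;
    cudaMalloc(&dOut, sizeof(int));
    recursiveKernel<<<1, 1>>>(100, dOut);

    // The blocking copy also surfaces errors from the kernel launch.
    err = cudaMemcpy(&hOut, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed: %s\n", cudaGetErrorString(err));
        cudaFree(dOut);
        return 1;
    }
    printf("sumTo(100) = %d\n", hOut);
    cudaFree(dOut);
    return 0;
}
```

The limit applies to every thread, so a large value multiplied by the number of resident threads can consume a noticeable amount of device memory.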

Thank you for your reply. I looked for a recursive function in my code and found one. Now I am going to fix it.

Thanks again,

Dalibor

Note that recursion is supported, at least on newer GPU architectures where CUDA uses an ABI and a proper stack. The compiler just makes you aware of the fact that you may have to manually adjust the stack size configured via the driver, as it cannot determine total stack usage at compile time. It may still be a good idea to remove the recursion for performance reasons.
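As a rough sketch of what removing recursion can look like, the example below shows a hypothetical recursive helper (gcdRecursive) next to a loop-based equivalent (gcdIterative). Both names and the kernel gcdKernel are made up for illustration and are not from the original poster's code. The recursion here is deliberately simple, and a compiler may already turn such a tail call into a loop; real cases often need an explicit loop or a manually managed work stack.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical recursive helper: call depth depends on the runtime inputs.
__host__ __device__ unsigned gcdRecursive(unsigned a, unsigned b) {
    return (b == 0) ? a : gcdRecursive(b, a % b);
}

// Same computation written as a loop: no self-call, so the frame size is fixed.
__host__ __device__ unsigned gcdIterative(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Hypothetical kernel using only the iterative version on the device.
__global__ void gcdKernel(const unsigned *a, const unsigned *b, unsigned *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = gcdIterative(a[i], b[i]);
}

int main() {
    const int n = 4;
    unsigned hA[n] = {12, 17, 100, 0};
    unsigned hB[n] = {18, 5, 75, 9};
    unsigned hOut[n] = {0, 0, 0, 0};

    unsigned *dA = nullptr, *dB = nullptr, *dOut = nullptr;
    cudaMalloc(&dA, n * sizeof(unsigned));
    cudaMalloc(&dB, n * sizeof(unsigned));
    cudaMalloc(&dOut, n * sizeof(unsigned));
    cudaMemcpy(dA, hA, n * sizeof(unsigned), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, n * sizeof(unsigned), cudaMemcpyHostToDevice);

    gcdKernel<<<1, 32>>>(dA, dB, dOut, n);

    // The blocking copy also reports errors from the kernel launch.
    cudaError_t err = cudaMemcpy(hOut, dOut, n * sizeof(unsigned), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare the device result with the recursive version run on the host.
    for (int i = 0; i < n; ++i) {
        printf("gcd(%u, %u): device iterative = %u, host recursive = %u\n",
               hA[i], hB[i], hOut[i], gcdRecursive(hA[i], hB[i]));
    }

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dOut);
    return 0;
}
```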