Meaning of nvlink warning: Stack size for entry function cannot be statically determined

Dalibor_CZ · November 30, 2012, 8:06am

Hello,
I am currently developing a fairly simple piece of CUDA code. It consists of a kernel call with input parameters; inside the kernel there are some declarations and allocations, followed by two nested while loops. Inside these loops a device function is called, which in turn has its own declarations and allocations and contains one more while loop (minimal code below).

The problem is that during compilation nvcc gives me this warning:

nvlink warning : Stack size for entry function ‘_Z11BlockKernelPK10cuLoopDataPK10cuBiphotonP9cuCrystalPK8cuPhotonS9_S9_P10cuMatrix3dIdESC_iidPK8cuMatrixIdE’ cannot be statically determined.

Can someone please explain what this warning means and how I can get rid of it?

Thanks a lot,

Dalibor

Kernel(inptPar1, ..., inptParN){
  ... declarations, allocations, initializations
  while(cond1){
    while(cond2){
      threadFunction(some parameters from the kernel parameter list and additional parameters declared, allocated, and initialized in Kernel);
    }
  }
}

threadFunction(some parameters from the kernel parameter list and parameters declared, allocated, and initialized in Kernel){
  ... declarations, allocations, initializations
  while(cond3){
    code with some if statements, allocations, and freeing of device memory
  }
}

njuffa · November 30, 2012, 9:01pm

I asked the linker team for insight. That message typically means that there is recursion in your code. In the absence of recursion, the compiler can normally determine stack usage at compile time and this data can then be communicated to the driver so it can size the stack accordingly.
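As a rough illustration of the kind of code that can lead to this message, here is a minimal sketch of a kernel that calls a recursive device function. The names fibRecursive, fibKernel, and runFibonacci are made up for this example and have nothing to do with the code in the original question. Whether the warning is actually printed may depend on the toolkit version, the optimization level, and whether relocatable device code (nvcc -rdc=true) is used, so treat this as an assumption rather than a guaranteed reproducer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical recursive device function. The recursion depth depends on the
// runtime value of n, so the stack usage cannot be bounded at build time.
__device__ unsigned long long fibRecursive(unsigned int n)
{
    if (n < 2) return n;
    return fibRecursive(n - 1) + fibRecursive(n - 2);
}

// Entry function (kernel) that reaches the recursive function.
__global__ void fibKernel(unsigned int n, unsigned long long *out)
{
    *out = fibRecursive(n);
}

// Host helper: launches the kernel with a single thread and returns the
// result, or 0 if any CUDA call reports an error.
unsigned long long runFibonacci(unsigned int n)
{
    unsigned long long *dOut = nullptr;
    unsigned long long hOut = 0;
    if (cudaMalloc(&dOut, sizeof(*dOut)) != cudaSuccess) return 0;
    fibKernel<<<1, 1>>>(n, dOut);
    // cudaMemcpy waits for the kernel and also surfaces launch errors.
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return (err == cudaSuccess) ? hOut : 0;
}

int main()
{
    printf("fib(10) = %llu\n", runFibonacci(10));
    return 0;
}
```

A small argument such as 10 keeps the recursion shallow, so the default per-thread stack should be sufficient here; deeper recursion is where the stack size starts to matter.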

When there is recursion, the compiler cannot accurately determine the required stack size, because the depth of the recursion is unknown at compile time, and your kernel may experience a stack overflow at runtime. In that case you would have to manually increase the stack size through the appropriate CUDA API (for example, cudaDeviceSetLimit with cudaLimitStackSize in the runtime API).
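A minimal sketch of adjusting the stack size from the host with the runtime API is shown below. The helper name setStackSizePerThread and the 8 KiB value are assumptions chosen purely for illustration; a suitable size depends on how deep the recursion in your own kernel can go.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: requests `bytes` of stack per device thread and
// returns the value the runtime reports afterwards, or 0 if a call fails.
size_t setStackSizePerThread(size_t bytes)
{
    if (cudaDeviceSetLimit(cudaLimitStackSize, bytes) != cudaSuccess) return 0;
    size_t actual = 0;
    if (cudaDeviceGetLimit(&actual, cudaLimitStackSize) != cudaSuccess) return 0;
    return actual;
}

int main()
{
    // Query the current per-thread stack size before changing it.
    size_t before = 0;
    if (cudaDeviceGetLimit(&before, cudaLimitStackSize) == cudaSuccess) {
        printf("stack size per thread before: %zu bytes\n", before);
    }

    // 8 KiB is an arbitrary illustrative value, not a recommendation.
    size_t after = setStackSizePerThread(8 * 1024);
    printf("stack size per thread after request: %zu bytes\n", after);

    // Kernels launched after this point run with the new limit.
    return 0;
}
```

The limit applies per thread, so a large value multiplied by the number of resident threads can consume a noticeable amount of device memory. With the driver API, the corresponding call is cuCtxSetLimit with CU_LIMIT_STACK_SIZE.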

Dalibor_CZ · December 3, 2012, 8:47am

Thank you for your reply. I looked for a recursive function in my code and found one. Now I am going to fix it.

Thanks again,

Dalibor

njuffa · December 3, 2012, 3:54pm

Note that recursion is supported, at least on newer GPU architectures where CUDA uses an ABI and a proper stack. The compiler just makes you aware of the fact that you may have to manually adjust the stack size configured via the driver, as it cannot determine total stack usage at compile time. It may still be a good idea to remove the recursion for performance reasons.
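To sketch what removing recursion can look like, the example below computes Fibonacci numbers with a loop instead of recursive calls. The names fibIterative, fibIterativeKernel, and runFibIterative are hypothetical, and the function is only a stand-in; it is not the function from the original question. Since there are no recursive calls, the toolchain should be able to determine the stack usage of this kernel statically.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Iterative Fibonacci: a fixed number of local variables and no recursive
// calls. Marked __host__ __device__ so it can also be checked on the host.
__host__ __device__ unsigned long long fibIterative(unsigned int n)
{
    unsigned long long prev = 0, curr = 1;
    for (unsigned int i = 0; i < n; ++i) {
        unsigned long long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return prev;
}

__global__ void fibIterativeKernel(unsigned int n, unsigned long long *out)
{
    *out = fibIterative(n);
}

// Host helper: launches the kernel with a single thread and returns the
// result, or 0 if any CUDA call reports an error.
unsigned long long runFibIterative(unsigned int n)
{
    unsigned long long *dOut = nullptr;
    unsigned long long hOut = 0;
    if (cudaMalloc(&dOut, sizeof(*dOut)) != cudaSuccess) return 0;
    fibIterativeKernel<<<1, 1>>>(n, dOut);
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return (err == cudaSuccess) ? hOut : 0;
}

int main()
{
    printf("host: fib(10) = %llu\n", fibIterative(10));
    printf("device: fib(10) = %llu\n", runFibIterative(10));
    return 0;
}
```

Not every recursive algorithm collapses into a simple loop like this; tree traversals, for instance, usually need an explicit, fixed-size stack or work list in local or shared memory instead.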