Hi guys,
This is probably a newbie question. I have a very simple piece of code.
In the .cuh file I defined a few functions; one of them is:
__device__ void myfunction(float* Args, short* param)
{
    Args = &somearray[0];
}
I also defined a variable myfloatArray as:
__device__ float* myfloatArray;
My kernel has this form:
extern "C" __global__ void myKernel(float* arrayfloat)
{
    int num_thread = blockIdx.x * blockDim.x + threadIdx.x;
    if (num_thread == 0)
    {
        myfloatArray = arrayfloat;
        short param = 1;
        myfunction(myfloatArray, &param);
    }
}
I restricted the body to num_thread==0 so that only one thread goes through the code, which makes it quick to debug with Nsight. The reason I am
using the variable myfloatArray is to make the pointer arrayfloat available globally to all the device functions I want to
write and use in my kernel, without passing the pointer to each of them (maybe not the smartest solution, but it seems to work).
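To make the setup concrete, here is a minimal self-contained sketch of the pattern I am describing. It is not my real code: readElement, the extra out parameter, and the host-side main are hypothetical names added only so the example can be compiled and run on its own.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-global pointer, visible to every __device__ function in this file.
__device__ float* myfloatArray;

// Hypothetical helper: reads through the global pointer instead of
// receiving the array as an argument.
__device__ float readElement(int i)
{
    return myfloatArray[i];
}

// 'out' is an assumption added for illustration, to get a value back to the host.
extern "C" __global__ void myKernel(float* arrayfloat, float* out)
{
    int num_thread = blockIdx.x * blockDim.x + threadIdx.x;
    if (num_thread == 0)
    {
        myfloatArray = arrayfloat;   // publish the kernel argument globally
        out[0] = readElement(2);     // same thread reads it back, so no race
    }
}

int main()
{
    const int n = 4;
    float host[n] = {1.0f, 2.0f, 3.0f, 4.0f};
    float* dIn = nullptr;
    float* dOut = nullptr;
    float result = 0.0f;

    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, sizeof(float));
    cudaMemcpy(dIn, host, n * sizeof(float), cudaMemcpyHostToDevice);

    myKernel<<<1, 32>>>(dIn, dOut);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
    {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    printf("%f\n", result);

    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}
```

In this sketch only thread 0 writes and reads the global pointer. If other threads of the same block were to read it, a __syncthreads() after the write would be needed, and nothing here orders the write with respect to threads in other blocks.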
My problem is that when I debug and try to step into myfunction, the debugger never gets inside the function: on its very first line it throws
this error:
Parallel Nsight Debug
Detected data stack underflow on 1 threads. First thread:
blockIdx = {0,0,0}
threadIdx = {0,0,0}
StackPointer = 0x00ffae00
StackLimit = 0x00fff840
PC = 0x0010ada8
FunctionRelativePC = 0x00000aa8
Would you have any idea what I did wrong?