Hi, I need help with passing arguments to device functions. The case that is giving me trouble is the following.
I have 2 device functions:
__device__ void foo1( float* a )
{
…
foo2( a ); //Notice I have not used the “&” this time as “a” is already a pointer.
…
}
__device__ void foo2( float* b )
{
…
*b = *b + 2;
…
}
In the kernel I have the following:
__global__ void kernel ( … )
{
float x;
…
foo1( &x);
…
}
Well, the code doesn’t work. While debugging I noticed that when calling foo1(), the parameter (address of x) is passed ok, but when foo1() calls foo2() there is a problem. What I see is that b is placed in a register. I can’t see its content.
Any idea? Is it recommended not to pass arguments by reference?
Thanks Cliff for your reply. The code is too long to put it here. I’m using CUDA Toolkit 3.1. I just want to know if passing arguments this way should work. When debugging, I see that some arguments (pointers actually) are in a register, and I can’t see their content, so I don’t know whether they are being passed ok, or not. I’ll try to put a portion of the code here.
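One way to check whether the pointers arrive intact without relying on the debugger's register view is to have the device code record the values it sees and copy them back to the host. The following is only a minimal sketch under assumptions: the extra `trace` parameter, the `run_trace` helper, and the initial value `x = 1.0f` are hypothetical and not part of the original code, and it has not been checked against CUDA Toolkit 3.1 specifically.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same shape as foo2 in the question: modifies the float that b points to.
__device__ void foo2(float* b)
{
    *b = *b + 2.0f;
}

// Hypothetical instrumented foo1: records the value before and after foo2.
__device__ void foo1(float* a, float* trace)
{
    trace[0] = *a;   // value foo1 sees on entry
    foo2(a);         // a is already a pointer, so no '&'
    trace[1] = *a;   // value after foo2 has modified it
}

__global__ void kernel(float* trace)
{
    float x = 1.0f;  // local variable, as in the question (initial value assumed)
    foo1(&x, trace);
    trace[2] = x;    // value the kernel sees after foo1 returns
}

// Hypothetical host helper: runs one thread and copies the trace back.
bool run_trace(float out[3])
{
    float* d_trace = NULL;
    if (cudaMalloc((void**)&d_trace, 3 * sizeof(float)) != cudaSuccess) return false;
    kernel<<<1, 1>>>(d_trace);
    // cudaMemcpy waits for the kernel and reports errors from the launch.
    bool ok = cudaMemcpy(out, d_trace, 3 * sizeof(float),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_trace);
    return ok;
}

int main()
{
    float t[3];
    if (!run_trace(t)) { printf("CUDA error\n"); return 1; }
    printf("entry=%f after_foo2=%f in_kernel=%f\n", t[0], t[1], t[2]);
    return 0;
}
```

If the pointer is passed correctly through both calls, the three recorded values should be 1, 3 and 3. Any other result would point at the argument passing rather than at the debugger display.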
What happens if you move foo2 in front of foo1? Inlined functions are supposed to precede the calling functions, and all functions are inlined by default under CUDA.
Apart from the inlining issue, you might need a function prototype for foo2 to indicate it takes a float* as argument, not an int.
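The prototype suggestion could look roughly like the sketch below: foo2 is declared before foo1, so foo1 can call it even though the definition comes later in the file. This assumes everything lives in a single .cu file; the `out` parameter, the `run_kernel` helper, and the initial value of `x` are made up for illustration, and it has not been tried with CUDA Toolkit 3.1 specifically.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prototype: tells the compiler that foo2 takes a float* before foo1 uses it.
__device__ void foo2(float* b);

__device__ void foo1(float* a)
{
    foo2(a);  // a is already a pointer, so it is passed through unchanged
}

// Definition placed after its caller; the prototype above covers the call.
__device__ void foo2(float* b)
{
    *b = *b + 2.0f;
}

__global__ void kernel(float* out)
{
    float x = 1.0f;  // assumed initial value, for illustration only
    foo1(&x);
    *out = x;        // expose the result so the host can check it
}

// Hypothetical host helper: returns the value written by the kernel,
// or a negative sentinel if any CUDA call fails.
float run_kernel()
{
    float* d_out = NULL;
    float h_out = -1.0f;
    if (cudaMalloc((void**)&d_out, sizeof(float)) != cudaSuccess) return -1.0f;
    kernel<<<1, 1>>>(d_out);
    if (cudaMemcpy(&h_out, d_out, sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess) h_out = -1.0f;
    cudaFree(d_out);
    return h_out;
}

int main()
{
    printf("x after foo1 = %f\n", run_kernel());
    return 0;
}
```

Moving the definition of foo2 above foo1 instead, as suggested in the first part of the reply, would make the prototype unnecessary.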
Sorry for not posting the complete code. Actually, the prototypes for all the functions are declared before the first function call. I had that problem once and learnt the lesson.
I think I solved the problem. Some constants simply had wrong values.
And I say “I THINK” because I have another problem: I cannot debug the whole code. If I compile the whole code with the -g and -G options, the build goes fine, but when I try to debug it, it seems symbols are missing. When breaking in the kernel, the debugger says “Single stepping until exit from function _Z3LORP3argPfS1_S1_S1_”, and I cannot trace the code. So what I have to do is remove parts of the code (calls to some “device functions”), and then I can single step. The problem is that I need all the code to check that everything works ok.
Any idea why the compiler may not be including debug symbols?