Declaration problems of __global__, __device__ (confused about declarations)

Hi,

I am trying to implement a program in CUDA that I have already written in C++. I wouldn’t call myself a really experienced programmer, so it may be that my problem is a basic misunderstanding of how declarations work.

The program I want to implement is quite complex, so, to keep an overview, I would like to organize it in several files and functions. Some of these functions contain C (or C++) code and some contain CUDA code.

I tried some basic function calls to get used to defining my functions, but I get error messages that I don’t understand.

The code I post does not do anything useful; it is just there to test the function calls.

This is my main.cu, which should just call a C function which then calls the kernel:

#include "function_host.cuh"

int main(){
    function_host();
}

Here is the function that is called. This is my first point of confusion: in the beginning I tried adding extern "C" before the function name, to tell the compiler that it should be compiled as C code. However, it also works without the extern "C".

#include "test_kernel.cu"

void function_host(){
    dim3 grid=1;
    dim3 threads=1;
    test_kernel<<<grid,threads>>>();
}

And here is my kernel function, test_kernel.cu:

#ifndef _TEST__KERNEL_CU_
#define _TEST_KERNEL_CU_

__global__ void test_kernel(){
    int g[10];
    for(int i =0; i<10; i++){
        g[i]=i;
    }
}

#endif

If I build it like that, I get the error ‘multiple definition of globfunc_Z11test_kernelv’. However, if I change __global__ to __device__ it works fine. So even though I don’t put __global__ on function_host, it seems to be treated as the kernel!? The NVIDIA programming guide says that any function without an extra qualifier is handled as a host function! So why does that happen?

Apart from that, when is it useful to put the definition of a function in a header file, and when in the normal source file? In my C++ code I made a header file for each source file containing all the definitions; however, I have already realized that this doesn’t work with CUDA.

So I guess this is really a basic problem of understanding how to define functions, but I have already spent some hours trying to figure it out and I still don’t get it.

thanks for any help!!

tom

P.S.: I am working on Linux Redhat 4.3, using Netbeans as IDE for C++ and CUDA.

typo: add or remove one _ between TEST and KERNEL

Sorry for the typo, this was a copy-and-paste error, which isn’t in my original source code. So the problem stays the same.

try it :)

#include "function_host.cuh"
extern "C"
void function_host();
int main(){
    function_host();
}

and

#include "test_kernel.cu"
extern "C"
void function_host();

void function_host(){
    dim3 grid=1;
    dim3 threads=1;
    test_kernel<<<grid,threads>>>();
}

Hi Ced,

I tried it the way you proposed; however, it still gives me the multiple-definition error for my __global__ function. I finally found out that when I define my __global__ function in an extra header file, it works. I don’t know why, but it works like that…

What I am still curious about is the fact that when I remove the extern "C" from my function_host() and define my test_kernel() as __device__, the program also works. However, according to the rules for function calls, a __device__ function can only be called from the device and not from the host. So it seems that even though I don’t define void function_host() as __global__, it is compiled as __global__.

Strange. I guess I need some practice to get used to the mix of definitions of __host__, __global__ and __device__.

Nevertheless, thanks for the help!

tom
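To make the three qualifiers discussed in the post above concrete, here is a minimal sketch. The names square, square_kernel and square_on_gpu are made up for illustration and do not appear in the thread, and the sketch assumes a reasonably current CUDA toolkit rather than the one used here. The point it tries to show: a function without a qualifier is an ordinary host function, and launching a kernel from it does not turn it into a kernel.

```cuda
#include <cuda_runtime.h>

// __device__: runs on the GPU and is called from other GPU code.
// (Hypothetical helper, not part of the original post.)
__device__ int square(int x) {
    return x * x;
}

// __global__: a kernel. It runs on the GPU and is launched from
// host code with the <<<grid, threads>>> syntax.
__global__ void square_kernel(int *out, int x) {
    *out = square(x);   // device code calling a __device__ function
}

// No qualifier (equivalent to __host__): a normal CPU function.
// It launches the kernel but is itself compiled for the host.
// Returns x*x computed on the GPU, or -1 if a CUDA call fails.
int square_on_gpu(int x) {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    square_kernel<<<1, 1>>>(d_out, x);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}
```

Calling square_on_gpu(3) from main() on a machine with a working GPU should give back the value computed by the kernel.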

Uh, did you include that header from multiple files? Then of course the function gets generated each time (and just in case you wonder: no, those multiple-inclusion guards cannot help against that).

I don’t really see why you include the .cu file instead of linking to it, but if you insist on putting stuff like functions or data in included files, you should declare them as "static", thus limiting their visibility. Advantage: it compiles. Disadvantage: the code will end up multiple times in your binary, causing useless bloat.
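The "link to it instead of including it" suggestion could look roughly like this: the header carries only a declaration, and the kernel and its wrapper are defined exactly once, in a single .cu file. To keep this one compilable listing, the two files are shown back to back with comments marking where each would start. The guard name FUNCTION_HOST_CUH, the int return value and the out parameter are assumptions added here so the result can be checked; they are not in the original code.

```cuda
#include <cuda_runtime.h>

// ---- would live in function_host.cuh ----
// Declaration only: safe to include from any number of source files.
#ifndef FUNCTION_HOST_CUH
#define FUNCTION_HOST_CUH
int function_host();
#endif

// ---- would live in function_host.cu ----
// The one and only definition of the kernel. Because no other file
// includes this code, the linker sees the symbol exactly once.
__global__ void test_kernel(int *out) {
    int g[10];
    int sum = 0;
    for (int i = 0; i < 10; i++) {
        g[i] = i;
        sum += g[i];
    }
    *out = sum;   // assumption: write a result so the host can check it
}

// Host wrapper: launches the kernel and returns its result,
// or -1 if a CUDA call fails.
int function_host() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    dim3 grid(1);
    dim3 threads(1);
    test_kernel<<<grid, threads>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}
```

main.cu would then only include function_host.cuh and call function_host(), with both .cu files compiled separately and linked together. If the kernel really has to stay in an included file, marking it static as suggested above gives every including file its own private copy instead.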