Error: External calls are not supported. How do I put __device__ functions in another module?

Hello,

I want to move some auxiliary device functions into another module, but I keep getting the "External calls are not supported" error.

I am trying the following:

header:

__device__ void atomicAdd(float * dst, const float * src, float f);

source:

__device__ void atomicAdd(float * dst, const float * src, float f) {
    for (int i = 0; i < nout; i++)
        atomicAdd(dst + i, src[i] * f);
}

main.cu

#include "header"

...

I know there is no such thing as a cuda linker, but is there a way to do this? Or do I have to put all device functions into the same source file?

When I put the entire function into the header, I get:

For some reason the atomicAdd library function is not found inside the header. Do I need another cuda library header? I am using

#include <cutil_inline.h>

Kind Regards

Make sure to add an -arch sm_XX flag that supports atomics (I think integer atomics need sm_11 or sm_12 and higher; the float overload of atomicAdd needs sm_20).

Regarding linking, I’m curious how big of a deal this is for you. I’ve been toying with the idea of writing a cuda linker.

I am already using this flag and atomicAdd() works fine inside the source file. I just can’t use it inside the header for some reason… I for one would appreciate a cuda linker for reasons like this. Do you plan this as part of some thesis or just for a challenge?

This works for me on NVCC 4.0 beta using -arch sm_20. When I omit the flag I get:

header.h(6): error: no instance of overloaded function "atomicAdd" matches the argument list
        argument types are: (float *, float)

Sometimes I get bored and I’m not very good at sports :) .

I renamed the function to atomicAddArr() and now it works. Perhaps there was some naming confusion with the built-in atomicAdd() o.O

Respectable all the more :) Perhaps someone knows if NVIDIA is planning something like this anyway?
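
For anyone finding this thread later, here is a minimal sketch of the header-only approach that ended up working above: the helper is defined (not just declared) in the header under the name atomicAddArr(), so each .cu file that includes it compiles its own copy and no external device call is needed. Some things here are my assumptions and not from the original posts: nout is passed as a parameter (the original snippet does not show where it comes from), the file names in the comments, and the accumulate kernel and runAccumulate host helper, which exist only to make the example complete. It should be compiled with -arch sm_20 or higher, since the float overload of atomicAdd is not available below that.

```cuda
// device_utils.cuh (hypothetical file name)
#include <cuda_runtime.h>
#include <vector>

// Defined in the header and marked static, so every translation unit that
// includes it gets its own copy. The name differs from the built-in
// atomicAdd() to avoid any overload confusion.
// Assumption: nout is passed in explicitly.
static __device__ __forceinline__ void atomicAddArr(float *dst, const float *src,
                                                    float f, int nout)
{
    for (int i = 0; i < nout; i++)
        atomicAdd(dst + i, src[i] * f);  // built-in float overload (sm_20+)
}

// main.cu (would normally #include "device_utils.cuh")

// Hypothetical kernel: every thread adds f * src[i] to dst[i].
__global__ void accumulate(float *dst, const float *src, float f, int nout)
{
    atomicAddArr(dst, src, f, nout);
}

// Hypothetical host helper: runs one block of `threads` threads and
// returns the accumulated result. Error checking is omitted for brevity.
std::vector<float> runAccumulate(const std::vector<float> &src, float f, int threads)
{
    const int n = static_cast<int>(src.size());
    const size_t bytes = n * sizeof(float);

    float *dSrc = 0;
    float *dDst = 0;
    cudaMalloc(&dSrc, bytes);
    cudaMalloc(&dDst, bytes);
    cudaMemcpy(dSrc, src.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(dDst, 0, bytes);  // start the accumulator at zero

    accumulate<<<1, threads>>>(dDst, dSrc, f, n);

    std::vector<float> out(n);
    cudaMemcpy(out.data(), dDst, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel

    cudaFree(dSrc);
    cudaFree(dDst);
    return out;
}
```

With 4 threads, f = 2 and src = {1, 2, 3}, each thread adds {2, 4, 6} once, so the result should be {8, 16, 24}. The downside of this approach is that the helper is recompiled in every file that includes the header, which is the price of not having a device linker.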