Tiny but painful error with host/device function __device__ cannot be called from __host__ __device_

How could you miss this little case? :P

__device__ void fun1() {
}

__host__ __device__ void fun2() {
#ifdef __CUDA_ARCH__
	fun1();
#endif
}

It works perfectly for host compilation, but for device compilation it ends with an error:

error: calling a device function from a host device function is not allowed

I hope this gets fixed very soon. Right now I have to mark functions __host__ __device__ even when they are never meant to run on the host…

Edit: Epic fail?

This fails to compile too!

__device__ void devFun() {}

__host__ void hstFun() {}

__host__ __device__ void mainFun() {
#ifdef __CUDA_ARCH__
//	devFun();   (this is commented out)
#else
	hstFun();
#endif
}

with error:

error: calling a host function from a device/global function is only allowed in device emulation model

So either I am doing something completely wrong, or the __CUDA_ARCH__ macro for telling host code from device code does not work at all!

Deadlines are coming and I am really screwed up with this…

A possible workaround for __host__ __device__ functions:

#ifdef __CUDA_ARCH__
#define HOSTDEVICE __device__
#define HCALL(fun)
#define DCALL(fun) fun
#else
#define HOSTDEVICE __host__
#define HCALL(fun) fun
#define DCALL(fun)
#endif

Declare the function with HOSTDEVICE. Wrap every call to a device-only function in DCALL(…) and every call to a host-only function in HCALL(…).

The host compiler then sees the function as a purely host one, while the device compiler sees it as a purely device one.
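For anyone wondering how the macros are meant to be used, here is a minimal sketch. The helpers devValue, hstValue and pick are hypothetical names made up for illustration, and the first four lines are my own addition so the file also builds with a plain C++ compiler. I have only reasoned through the host side this way; how the device pass behaves is as described in the post, not something this sketch proves.

```cpp
// Assumption: let the file build without nvcc by blanking the CUDA qualifiers.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// The workaround macros from the post above.
#ifdef __CUDA_ARCH__
#define HOSTDEVICE __device__
#define HCALL(fun)
#define DCALL(fun) fun
#else
#define HOSTDEVICE __host__
#define HCALL(fun) fun
#define DCALL(fun)
#endif

// Hypothetical helpers, one per side.
__device__ int devValue() { return 1; }
__host__ int hstValue() { return 2; }

// Declared __host__ in the host pass and __device__ in the device pass.
HOSTDEVICE int pick() {
	int result = 0;
	HCALL(result = hstValue());  // kept only in the host pass
	DCALL(result = devValue());  // kept only in the device pass
	return result;
}
```

In a host build only the HCALL line survives preprocessing, so pick() ends up calling hstValue(). The unused macro still leaves a lone semicolon behind, which is harmless.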

Still, I would be grateful if someone could acknowledge the problem and confirm that it is going to be corrected in the next release…

So? No comments? No ideas how it could be resolved better?
And no promises for the next toolkit?

Bingo!

You cannot call a device function from a host function. They run on different processors. They are compiled to different assembly codes.

If a function is to be compiled for both device and host, it must be compiled twice.

In the host compilation, the call to the device only function makes no sense.
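To make the "compiled twice" point concrete, here is a small sketch. nvcc defines __CUDA_ARCH__ only while it compiles the device version of the code, so the same function body can take a different branch in each pass. The function name compiledForDevice is hypothetical, and the first four lines are an assumption of mine so the snippet also builds with a plain C++ compiler.

```cpp
// Assumption: let the file build without nvcc by blanking the CUDA qualifiers.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Reports which compilation pass produced the code that is running.
__host__ __device__ inline bool compiledForDevice() {
#ifdef __CUDA_ARCH__
	return true;   // device pass: nvcc defines __CUDA_ARCH__
#else
	return false;  // host pass or plain C++ build: the macro is undefined
#endif
}
```

Called from host code, this returns false. Called from a kernel, the device copy of the same function would return true.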

But I am not trying to call a device function from a host function!

The compiler only thinks that might happen, because it does not take the __CUDA_ARCH__ guard into account.

Read my example again, please!

Ha… that's a tricky thing for the compiler to handle…

Maybe you can try an “#ifndef” on one of the host compiler's defines… It depends, but it may work out…

I am happy to note that with CUDA 3.1 this problem is resolved.
Thank you!
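For reference, this is roughly the pattern from the second example in this thread, with both branches active. It is only a sketch: I changed the return types to int so the result can be checked, and the first four lines are an assumption added so the snippet also builds with a plain C++ compiler.

```cpp
// Assumption: let the file build without nvcc by blanking the CUDA qualifiers.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

__device__ int devFun() { return 10; }  // device-only helper
__host__ int hstFun() { return 20; }    // host-only helper

// One function, two bodies: each pass only sees the call it can make.
__host__ __device__ int mainFun() {
#ifdef __CUDA_ARCH__
	return devFun();  // device pass
#else
	return hstFun();  // host pass
#endif
}
```

On the host, mainFun() goes through hstFun(). The devFun() call only exists in the device version.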
