Update: it works if I also pass nvcc a .cu file (any .cu file, even an empty one). That is, if I create an empty file empty.cu, and my previously posted code is test.c, then:
nvcc -deviceemu test.c empty.cu creates an executable that works correctly.
So it appears that nvcc only gives you device-emulated code when at least one of the input files is a .cu file. Assuming you’re going to be calling a kernel at some point, at least one of the input files must be a .cu file anyway, so this isn’t really a big deal.
If you are on Linux, creating a symlink is also a solution:
ln -s foo.c bar.cu
That’s what I do in a multi-platform project where the algorithms have to be shared amongst standard i386 targets, Cell and CUDA.
I also have to ensure that I get the right function modifiers. I do this with the preprocessor:
/* If being compiled by the NVCC compiler, ensure algorithms
* get proper modifiers */
#if (defined(__CUDACC__))
#define _CUDA_FUNC_MODIFIER __device__ __host__
#else
#define _CUDA_FUNC_MODIFIER
#endif
_CUDA_FUNC_MODIFIER int devillers(const triangle_t *t1, const triangle_t *t2) {
...
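For anyone who wants to try the pattern without the rest of the project, here is a minimal sketch of a complete shared source file. The type vec3_t and the function dot3 are made-up names for illustration only, not part of the code above. The idea is that the same file should build as plain C with an ordinary compiler (the macro expands to nothing) and as CUDA code when nvcc defines __CUDACC__ for it, for example through the .cu symlink.

```c
/* Hypothetical shared source file: vec3_t and dot3 are illustration-only
 * names. The macro follows the same pattern as in the post above. */
#if defined(__CUDACC__)
/* Compiled by nvcc as CUDA code: make the function callable from
 * both host and device code. */
#define _CUDA_FUNC_MODIFIER __device__ __host__
#else
/* Compiled by an ordinary C compiler: the modifier expands to nothing. */
#define _CUDA_FUNC_MODIFIER
#endif

typedef struct {
    int x, y, z;
} vec3_t;

/* Integer dot product of two 3-component vectors. */
_CUDA_FUNC_MODIFIER int dot3(const vec3_t *a, const vec3_t *b)
{
    return a->x * b->x + a->y * b->y + a->z * b->z;
}
```

On the host side you call dot3(&a, &b) like any other C function. When the file is built through nvcc, the same function should also be callable from inside a kernel.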