deviceemu when you have .c files

Does -deviceemu not work when you call runtime API functions from within .c files? That is, must all runtime API calls be made from within .cu files?

Compiling and running the following code using -deviceemu as a .c file does not work properly, but it does work properly as a .cu file.

#include <stdio.h>
#include <cuda_runtime_api.h>

int main(void) {
    int* p;
    cudaError_t err;

    /* Allocate 4 bytes on the device and report any runtime API error. */
    err = cudaMalloc((void**)(&p), 4);
    if (err != cudaSuccess) printf("Error: %s\n", cudaGetErrorString(err));

    printf("Done");
    return 0;
}

When I compile as nvcc -deviceemu test.c, then run the executable, I get “Error: feature is not yet implemented” followed by “Done”.

When I compile as nvcc -deviceemu test.cu, then run the executable, I just get the output “Done” (which is correct).

Update: It works if I also pass nvcc a .cu file (any .cu file, even an empty one). That is, if I create an empty file empty.cu and my previously posted code is test.c, then:
nvcc -deviceemu test.c empty.cu creates an executable that works correctly.

So it appears that for nvcc to really give you device-emulated code, at least one of the input files must be a .cu file. (Assuming you’re going to call a kernel at some point, at least one of the input files must be a .cu file anyway, so this isn’t really a big deal.)

If you are on Linux, creating a symlink is also a solution:

ln -s foo.c bar.cu

That’s what I do in a multi-platform project where the algorithms have to be shared amongst standard i386 targets, Cell and CUDA.

I also have to ensure that I get the right function modifiers, which I do with the preprocessor:

/* If being compiled by the NVCC compiler, ensure algorithms
 * get proper modifiers */
#if (defined(__CUDACC__))
#define _CUDA_FUNC_MODIFIER __device__ __host__
#else
#define _CUDA_FUNC_MODIFIER
#endif

_CUDA_FUNC_MODIFIER int devillers(const triangle_t *t1, const triangle_t *t2)  {
...