I’ve ported and tested a chunk of code from C++ to C as a precursor to running it on CUDA, but I’m now having problems getting it to compile.
An example would best illustrate the problem:
I have my kernel defined in the file cudaKernel.cu.
This calls a function foo() which is declared and defined in foo.h and foo.c.
It compiles and runs fine in emulation mode but I get ‘External calls are not supported’ errors when I compile in ‘device’ mode.
BTW - this test function has almost nothing in it, for test purposes, i.e. int x = 1 * 2.
I’m expecting the compiler to inline foo() with no problems but it’s beginning to look like I need to have all the functions which the kernel function depends on defined in the same file as the kernel function. Is that right?
Maybe I’m missing something but this seems nuts to me and is a big obstacle to being able to run code on CUDA or CPU using a single code line. Ditto the need to add __device__ prefixes. Why is this necessary? Can’t the compiler figure this out itself?
TIA
Mark
You’re probably not going to get any benefit trying to run CPU code on CUDA. You need to parallelize your algorithm. __device__ is pretty minimal and clean imho. You can use __device__ and __host__ together.
Thanks for the reply but it didn’t really answer my question.
To your points though:
- parallelising your algorithm is not the only way to get a performance speedup: you can also run your algorithm in parallel e.g. if I have 10000 options to price I can price 240 simultaneously without any parallelisation of the pricing algorithm;
- for the moment I’ve hacked all the code into one file and currently see a speedup of around 16 over the CPU - that’s without any tuning and using an AoS data layout which will be resulting in horrible memory accesses.
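To make the two points above a bit more concrete, here is a rough sketch in plain C of the difference between an AoS (array of structures) and an SoA (structure of arrays) layout, with one independent pricing step per option. The field names and the intrinsic-value "pricing" are made up for illustration and are not the real code. The assumption is that on the GPU index i would be handled by thread i, so with SoA neighbouring threads read neighbouring floats.

```c
#include <stddef.h>

/* AoS: one record per option. Field names are hypothetical. */
typedef struct {
    float spot;
    float strike;
    float payoff;
} OptionAoS;

/* SoA: one contiguous array per field, allocated by the caller. */
typedef struct {
    float *spot;
    float *strike;
    float *payoff;
} OptionsSoA;

/* Copy n AoS records into the caller-allocated SoA arrays. */
static void aos_to_soa(const OptionAoS *in, OptionsSoA *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out->spot[i]   = in[i].spot;
        out->strike[i] = in[i].strike;
        out->payoff[i] = in[i].payoff;
    }
}

/* Stand-in pricing step (call intrinsic value, not a real pricer).
 * Every index is independent of the others, so the loop body is
 * what a single GPU thread would execute for its own option. */
static void price_all(OptionsSoA *o, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float v = o->spot[i] - o->strike[i];
        o->payoff[i] = v > 0.0f ? v : 0.0f;
    }
}
```

Typical use would be to fill an OptionAoS array, call aos_to_soa() once on the host, and then hand the three flat arrays to the pricing code.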
best regards
Mark
The CUDA compiler doesn’t currently support linking device code, so all the device functions used by each kernel must be in a single file. You can get around this to a certain extent by putting common functions into header files and including them in your .cu file.
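As a minimal sketch of the header workaround, assuming foo() is small enough to be defined directly in foo.h: the macro name FOO_HD, the body of foo() and the names fooKernel and fooCpu are invented for illustration. The file boundaries are shown as comments so the whole thing fits in one listing. nvcc defines __CUDACC__ when it compiles a .cu file, so the same header can also be built by an ordinary C compiler for the CPU path.

```c
/* ---- foo.h (hypothetical contents) ---- */
#ifndef FOO_H
#define FOO_H

#ifdef __CUDACC__
#define FOO_HD __host__ __device__   /* compiled by nvcc: build for CPU and GPU */
#else
#define FOO_HD                       /* plain C compiler: qualifiers vanish */
#endif

/* Defined in the header so the kernel's translation unit sees the body. */
FOO_HD static inline int foo(int x)
{
    return x * 2;
}

#endif /* FOO_H */

/* ---- cudaKernel.cu (would start with: #include "foo.h") ---- */
#ifdef __CUDACC__
/* One thread per element; foo() comes from the included header. */
__global__ void fooKernel(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = foo(in[i]);
}
#else
/* CPU fallback using exactly the same foo(). */
static void fooCpu(const int *in, int *out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = foo(in[i]);
}
#endif
```

With this layout there is no foo.c for the device path at all: the definition travels with the header, which is what lets the kernel call it without any device linking.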
Are there plans to provide linking of device code in the future?