This is a really simple question, but I’ve only just come across a program where I need it answered.

I have a piece of code in C++ that runs heavily in parallel. Part of this parallel code currently calls a function several times (just with different data). Can I make a normal C function call to this function from my CUDA kernel, or do I have to merge the function body into the kernel itself? If I have to do the latter, it is going to make my code about 10 times as long and really ugly!


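Just declare the function with the __device__ qualifier, e.g. “__device__ int function(int a, int b, …)”, and call it from your kernel like a normal C function. Keep in mind that the compiler will usually inline this function, so there is normally no function call stack.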
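
Here is a minimal sketch of what that could look like. The names combine, combine_kernel and run_combine are made up for illustration and the arithmetic is a placeholder for whatever your real function does. Error checking on the CUDA calls is left out to keep it short.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper. __device__ means it can only be called from
// device code (a kernel or another __device__ function).
__device__ int combine(int a, int b)
{
    return a * a + b;
}

// The kernel calls the helper several times with different data,
// exactly like an ordinary C function call.
__global__ void combine_kernel(const int* x, const int* y, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = combine(x[i], y[i]) + combine(y[i], x[i]);
    }
}

// Host-side wrapper (illustrative only): copies the inputs to the GPU,
// launches the kernel and copies the result back.
// Assumes x and y have the same length.
std::vector<int> run_combine(const std::vector<int>& x, const std::vector<int>& y)
{
    int n = static_cast<int>(x.size());
    std::vector<int> out(n);
    if (n == 0) return out;

    size_t bytes = static_cast<size_t>(n) * sizeof(int);
    int *dx = nullptr, *dy = nullptr, *dout = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMalloc(&dout, bytes);
    cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up so every element is covered
    combine_kernel<<<blocks, threads>>>(dx, dy, dout, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(out.data(), dout, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dout);
    return out;
}
```

If you also need to call the same function from ordinary host code, you can mark it “__host__ __device__” instead and the compiler will build a version for each side.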