Function Call Overhead

Is function call overhead significant in OpenCL-C or does the compiler just inline calls anyway since there is no real stack? Is it best to inline code by hand that otherwise would be written as a separate (non-kernel) function?

Thanks in advance.

I think all the function calls are inlined.

How should parameter passing be done in order to get best performance?

float4 someFunction(float4 arg1, float4 arg2);

__kernel ..... {

  float4 a=..., b=...;

  float4 c = someFunction(a,b);

}

or

float4 someFunction(float4* arg1, float4* arg2);

__kernel ..... {

  float4 a=..., b=...;

  float4 c = someFunction(&a,&b);

}

Does it matter? My theory was that passing pointers avoids copying the parameters, but I'm not sure what happens behind the scenes with these pointers.
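For anyone who wants to measure this, here is a minimal sketch that puts both variants in one kernel so they can be compared under a profiler. The helper names (blend_by_value, blend_by_pointer), the kernel name, the buffer arguments, and the arithmetic are all made up for illustration and are not from the code above. If the compiler inlines both helpers, the two calls may well compile to the same instructions, but that is an assumption to verify on your own device.

```c
// Hypothetical helper: arguments passed by value.
float4 blend_by_value(float4 a, float4 b)
{
    return a * b + a;
}

// Hypothetical helper: arguments passed by pointer.
// The pointers refer to per-work-item variables, so they are declared
// in the __private address space explicitly.
float4 blend_by_pointer(const __private float4 *a, const __private float4 *b)
{
    return (*a) * (*b) + (*a);
}

// Hypothetical kernel that runs both variants on the same inputs.
__kernel void compare_passing(__global const float4 *in_a,
                              __global const float4 *in_b,
                              __global float4 *out_value,
                              __global float4 *out_pointer)
{
    size_t i = get_global_id(0);

    // Load once into private variables, as in the question.
    float4 a = in_a[i];
    float4 b = in_b[i];

    out_value[i]   = blend_by_value(a, b);
    out_pointer[i] = blend_by_pointer(&a, &b);
}
```

Both output buffers should hold identical values, so any difference in timing or register usage comes from how the compiler handled the call, not from the math.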

In theory, it’s better to use pointers, but in practice you should measure: use the profiler or check the register count :P
