Hello,
I have a function that runs on the CPU, and it is called like this:
void CPUFun( … )
{
// allocate nearly 15 pointers with cudaMalloc()
// launch 7 GPU functions here
// all GPU function launch configs are <<<1,1>>>
GPUFun1 <<<1,1>>> ( … );
// … up to …
GPUFun7<<<1,1>>> ( … );
}
int main()
{
// call CPUFun()
STARTTIME; // a macro
CPUFun( … );
STOPTIME; // a macro
printf("CPUFun() execution time: %f ms\n", time); // assumes the macros store milliseconds in a floating-point `time`
}
Of the 7 GPU functions, 6 take no more than 1 ms each to execute, and one takes 1.5 ms.
But when I print the CPUFun() execution time, it shows 30+ ms.
Is this execution time for CPUFun() expected?
Since the kernels add up to at most about 7.5 ms, I think the CPUFun() execution time should be no more than 10 ms.
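To narrow down where the extra time goes, the phases inside CPUFun() could be timed separately. What follows is only a minimal sketch under some assumptions: `dummyKernel`, `CHECK`, `msSince`, and the 4-byte buffer size are placeholders and not my real code. The first CUDA runtime call usually pays for context creation, and kernel launches are asynchronous, so a host timer has to synchronize before stopping. Whether this accounts for the 30+ ms in my case is not confirmed.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Placeholder for GPUFun1 .. GPUFun7; the real kernels are not shown here.
__global__ void dummyKernel(float *p) { p[0] = 1.0f; }

// Host wall-clock milliseconds elapsed since t0.
static double msSince(std::chrono::steady_clock::time_point t0) {
    return std::chrono::duration<double, std::milli>(
               std::chrono::steady_clock::now() - t0).count();
}

int main() {
    const int kNumBuffers = 15;  // "nearly 15 pointers"
    const int kNumKernels = 7;   // seven <<<1,1>>> launches
    float *buf[kNumBuffers];

    // Phase 0: force context creation so it is not charged to later phases.
    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaFree(0));
    printf("context init : %.3f ms\n", msSince(t0));

    // Phase 1: the allocations alone, measured with a host timer.
    t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < kNumBuffers; ++i)
        CHECK(cudaMalloc(&buf[i], sizeof(float)));
    printf("cudaMalloc   : %.3f ms\n", msSince(t0));

    // Phase 2: the kernels alone, measured on the GPU with CUDA events.
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start));
    for (int k = 0; k < kNumKernels; ++k)
        dummyKernel<<<1, 1>>>(buf[0]);
    CHECK(cudaGetLastError());          // catch launch errors
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));  // wait until all kernels finish
    float kernelMs = 0.0f;
    CHECK(cudaEventElapsedTime(&kernelMs, start, stop));
    printf("kernels      : %.3f ms\n", kernelMs);

    // Cleanup.
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    for (int i = 0; i < kNumBuffers; ++i)
        CHECK(cudaFree(buf[i]));
    return 0;
}
```

This should build with nvcc as a single .cu file. The printed numbers depend on the GPU and driver, so the point is only to see which phase dominates.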
Please help with this…