Hi
I want to know how to run a kernel function and a host function simultaneously to reduce the elapsed time.
Example:
for (;;)
{
    // computation in Host (elapsed time 400 us)

    // computation in Kernel (elapsed time 400 us)
}
Run one after the other, the two steps take about 800 us per iteration.
If I want each iteration (1 cycle) of this loop to finish in under 500 us, how can I do that?
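
To make the question concrete, here is a minimal sketch of the kind of overlap I have in mind. It relies on kernel launches being asynchronous with respect to the host, so the launch returns right away and the CPU can work until an explicit synchronize. The names deviceWork, hostWork and runOverlapped, the buffer size, and the arithmetic are all hypothetical placeholders for my real computations, and the sketch assumes the host step and the kernel step of one iteration do not depend on each other's results. If that holds, I would expect a cycle to take roughly the longer of the two steps plus launch and sync overhead, but I have not measured this.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the ~400 us of device work.
__global__ void deviceWork(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = data[i] * 2.0f + 1.0f;
    }
}

// Hypothetical stand-in for the ~400 us of host work.
static void hostWork(std::vector<float> &data)
{
    for (float &x : data) {
        x = x * 0.5f + 1.0f;
    }
}

// Runs `iterations` cycles; returns 0 on success, 1 on any CUDA error.
int runOverlapped(int iterations)
{
    const int n = 1 << 20;              // assumed problem size
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMemset(d_data, 0, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_data);
        return 1;
    }

    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) {
        cudaFree(d_data);
        return 1;
    }

    std::vector<float> h_data(n, 1.0f); // host-only data, independent of d_data
    int status = 0;

    for (int iter = 0; iter < iterations; ++iter) {
        // The launch only queues the kernel; control returns to the host at once.
        deviceWork<<<blocks, threads, 0, stream>>>(d_data, n);
        if (cudaGetLastError() != cudaSuccess) { status = 1; break; }

        // The CPU works here while the GPU is (ideally) still running the kernel.
        hostWork(h_data);

        // Wait for the kernel before starting the next cycle.
        if (cudaStreamSynchronize(stream) != cudaSuccess) { status = 1; break; }
    }

    cudaStreamDestroy(stream);
    cudaFree(d_data);
    return status;
}

int main()
{
    int status = runOverlapped(10);
    std::printf("%s\n", status == 0 ? "done" : "CUDA error");
    return status;
}
```

If the kernel needs input from the host step of the same iteration (or the other way round), this simple form would not apply as written, and the loop would presumably have to be pipelined so that the host prepares iteration i+1 while the kernel processes iteration i.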
Thanks.