I am trying to optimize my code by moving as much computation as possible (even the smaller for loops and if checks) onto the device.
Essentially, I have a device function that returns a bool value (indicating success or failure) and is called within the global function. Since I can't use printf on the device, is it feasible to copy the return value of the device function back to the host?
Here is the skeleton of the code:
Device:
__device__ bool checkcriterion()
__global__ void kernel_func()
{
    bool status;
    // some array processing logic
    status = checkcriterion(); // device function called within the kernel function
}
Host:
if (status == 1) pass
else fail
I would also like to know whether this is a feasible approach.
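You need to pass ‘status’ to the kernel function as a device pointer, because a variable declared inside the kernel is local to each thread and is not visible to the host. If your device can map pinned host memory into the device address space, then I would do that for such small values.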
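As a rough sketch of the device-pointer approach, something like the following might work. The criterion itself (an element passes if it is non-negative), the names checkCriterion, kernelFunc and runCheck, the use of an int flag instead of a bool, and the block size of 256 are all assumptions made for illustration; they are not taken from the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical criterion (an assumption): an element passes if it is non-negative.
__device__ bool checkCriterion(float value)
{
    return value >= 0.0f;
}

// Each thread checks one element. Any thread that sees a failure clears the
// flag that lives in device global memory, so the result outlives the kernel.
__global__ void kernelFunc(const float *data, int n, int *status)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && !checkCriterion(data[i])) {
        atomicExch(status, 0);  // 0 = fail
    }
}

// Host wrapper (hypothetical helper): returns true if every element passed.
// Assumes n > 0 and that hostData points to n floats.
bool runCheck(const float *hostData, int n)
{
    float *dData = nullptr;
    int *dStatus = nullptr;
    int hStatus = 1;  // 1 = pass until some thread reports a failure

    cudaMalloc(&dData, n * sizeof(float));
    cudaMalloc(&dStatus, sizeof(int));
    cudaMemcpy(dData, hostData, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dStatus, &hStatus, sizeof(int), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    kernelFunc<<<blocks, threads>>>(dData, n, dStatus);

    // A blocking cudaMemcpy on the default stream waits for the kernel to
    // finish, and it also reports errors from the earlier calls.
    cudaError_t err = cudaMemcpy(&hStatus, dStatus, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        hStatus = 0;  // treat a CUDA error as a failed check
    }

    cudaFree(dData);
    cudaFree(dStatus);
    return hStatus == 1;
}
```

On the host, a call such as runCheck(myArray, count) then gives the pass/fail result directly. For the mapped pinned memory variant, the cudaMalloc and cudaMemcpy calls for the flag could be replaced by cudaHostAlloc with the cudaHostAllocMapped flag plus cudaHostGetDevicePointer, followed by cudaDeviceSynchronize before the host reads the flag.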