Community,
I am running a system that processes a stack of images.
I crop each frame as it loads inside ROUTINE and assign a portion of each image to each of the processors.
I have threaded the system so that each GPU is tied to a single CPU thread.
I start the threads and wait for them to finish like this:
for( int device = 0; device < num_gpus; device++ )
{
    threadID[device] = cutStartThread( ROUTINE );
}
cutWaitForThreads(threadID, num_gpus);
How can I send the data back to the CPU without exiting ROUTINE?
Is there a way for a thread to set a flag once cudaMemcpy has been called, so that the main CPU thread knows the data is ready?
I have no clue how to do this.
I don't want to re-create the threads and call ROUTINE again for each frame, because it performs many memory allocations, which bog down the run time.
HELP!
Thanks.
BTW, I'm a novice at multithreading.
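
To make the question concrete, here is a rough sketch of the kind of flag signalling I have in mind: each thread stays alive, allocates once, and sets a "done" flag per frame that the main thread waits on. This is only an assumption about how it could be done, not something taken from cutil. It uses standard C++ threads instead of cutStartThread, the names Mailbox, worker and process_frames are made up for illustration, and the CUDA calls appear only as comments, with a plain integer standing in for the buffer that cudaMemcpy would fill.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mailbox shared between the main thread and one GPU worker.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    int frame = -1;         // frame index to process (set by main thread)
    bool has_work = false;  // main -> worker: a new frame is waiting
    bool done = false;      // worker -> main: result for this frame is ready
    bool quit = false;      // main -> worker: leave the loop
    int result = 0;         // stand-in for the host buffer filled by cudaMemcpy
};

// Stand-in for ROUTINE: set up once, then loop over frames without exiting.
void worker(int device, Mailbox* box) {
    // Real code would call cudaSetDevice(device) and cudaMalloc(...) once here.
    for (;;) {
        int frame;
        {
            std::unique_lock<std::mutex> lock(box->m);
            box->cv.wait(lock, [box] { return box->has_work || box->quit; });
            if (box->quit) break;
            box->has_work = false;
            frame = box->frame;
        }
        // Real code would launch the kernel and then call
        // cudaMemcpy(..., cudaMemcpyDeviceToHost) here.
        int value = frame * 10 + device;
        {
            std::lock_guard<std::mutex> lock(box->m);
            box->result = value;
            box->done = true;  // the flag the main thread is waiting on
        }
        box->cv.notify_all();
    }
    // Real code would call cudaFree(...) once here.
}

// Main-thread side: start the workers once, feed them frames, collect results.
// Returns how many per-device results were received.
int process_frames(int num_gpus, int num_frames) {
    std::vector<Mailbox> boxes(num_gpus);
    std::vector<std::thread> threads;
    for (int d = 0; d < num_gpus; ++d)
        threads.emplace_back(worker, d, &boxes[d]);

    int received = 0;
    for (int f = 0; f < num_frames; ++f) {
        // Post the frame to every worker.
        for (auto& box : boxes) {
            {
                std::lock_guard<std::mutex> lock(box.m);
                box.frame = f;
                box.done = false;
                box.has_work = true;
            }
            box.cv.notify_all();
        }
        // Wait for each worker's done flag, then read its result.
        for (int d = 0; d < num_gpus; ++d) {
            std::unique_lock<std::mutex> lock(boxes[d].m);
            boxes[d].cv.wait(lock, [&] { return boxes[d].done; });
            assert(boxes[d].result == f * 10 + d);
            ++received;
        }
    }

    // Tell the workers to exit, then join them (the cutWaitForThreads step).
    for (auto& box : boxes) {
        {
            std::lock_guard<std::mutex> lock(box.m);
            box.quit = true;
        }
        box.cv.notify_all();
    }
    for (auto& t : threads) t.join();
    return received;
}
```

Calling process_frames(num_gpus, num_frames) from main would start the threads once and join them only after the last frame, so the allocations inside the worker happen a single time per GPU.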