About synchronizing all blocks

I've read through some posts and gathered that the way to synchronize all blocks is to let the kernel finish and launch a new one.
Suppose I want to run several image-processing steps on a source image, for example:
process 1: average
process 2: laplacian
process 3: Sobel filter

If I do it like this:
1. Copy the source data from host to device
2. Run the processes, like
printf("Process1 End.\n");
printf("Process2 End.\n");
3. Copy from device to host

Will this code achieve the goal of finishing the previous kernel before starting the next one?
And can it carry out the whole image-processing sequence I mentioned?

Yep, that's the way to do it.
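
For what it's worth, here is a rough sketch of how the three steps could be chained as separate kernel launches. Kernels launched into the same stream run in order, so every block of one kernel finishes before the next kernel starts. Everything named here is my own assumption and not from the original post: the kernel names (average3x3, laplacian, sobel), the helpers px and runPipeline, the float image layout, the clamp-to-edge border handling, and the idea that each step consumes the previous step's output. Note that kernel launches are asynchronous with respect to the host, so the sketch calls cudaDeviceSynchronize() before each printf; without it the message may print before the kernel has actually ended.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); exit(1); } } while (0)

// Hypothetical helper: read a pixel, clamping coordinates to the image border.
__device__ float px(const float* img, int w, int h, int x, int y) {
    x = min(max(x, 0), w - 1);
    y = min(max(y, 0), h - 1);
    return img[y * w + x];
}

// Process 1: 3x3 box average.
__global__ void average3x3(const float* in, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float s = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            s += px(in, w, h, x + dx, y + dy);
    out[y * w + x] = s / 9.0f;
}

// Process 2: 4-neighbour Laplacian.
__global__ void laplacian(const float* in, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    out[y * w + x] = px(in, w, h, x - 1, y) + px(in, w, h, x + 1, y)
                   + px(in, w, h, x, y - 1) + px(in, w, h, x, y + 1)
                   - 4.0f * px(in, w, h, x, y);
}

// Process 3: Sobel gradient magnitude.
__global__ void sobel(const float* in, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float gx = (px(in, w, h, x + 1, y - 1) + 2.0f * px(in, w, h, x + 1, y) + px(in, w, h, x + 1, y + 1))
             - (px(in, w, h, x - 1, y - 1) + 2.0f * px(in, w, h, x - 1, y) + px(in, w, h, x - 1, y + 1));
    float gy = (px(in, w, h, x - 1, y + 1) + 2.0f * px(in, w, h, x, y + 1) + px(in, w, h, x + 1, y + 1))
             - (px(in, w, h, x - 1, y - 1) + 2.0f * px(in, w, h, x, y - 1) + px(in, w, h, x + 1, y - 1));
    out[y * w + x] = sqrtf(gx * gx + gy * gy);
}

// Hypothetical host function: runs the three steps one after another.
std::vector<float> runPipeline(const std::vector<float>& src, int w, int h) {
    size_t bytes = static_cast<size_t>(w) * h * sizeof(float);
    float *a = nullptr, *b = nullptr;  // ping-pong device buffers
    CHECK(cudaMalloc(&a, bytes));
    CHECK(cudaMalloc(&b, bytes));

    // 1. Copy the source data from host to device.
    CHECK(cudaMemcpy(a, src.data(), bytes, cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);

    // 2. One launch per process; each kernel ends before the next begins.
    average3x3<<<grid, block>>>(a, b, w, h);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // only so the message below is truthful
    printf("Process1 End.\n");

    laplacian<<<grid, block>>>(b, a, w, h);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    printf("Process2 End.\n");

    sobel<<<grid, block>>>(a, b, w, h);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    printf("Process3 End.\n");

    // 3. Copy the result from device to host.
    std::vector<float> dst(static_cast<size_t>(w) * h);
    CHECK(cudaMemcpy(dst.data(), b, bytes, cudaMemcpyDeviceToHost));

    CHECK(cudaFree(a));
    CHECK(cudaFree(b));
    return dst;
}
```

Calling runPipeline(image, width, height) with a row-major float image returns the result of the third step. The cudaDeviceSynchronize() calls are not needed for correct ordering between the kernels; the same-stream launch order and the blocking cudaMemcpy at the end already guarantee that. They are there for the printf messages and to surface kernel errors early.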