Noob here with a simple question: can we launch a device kernel and carry on with CPU computation without waiting for the GPU to finish? Something like the right-hand part of the attached image.
Is this possible on Tesla? On Fermi?
Any simple examples?