simple question

I run a kernel for a calculation, e.g. vector add. I would like to do other things on the CPU while the GPU does the work. Is there a CUDA API that starts the kernel and returns immediately, and then lets me check a status to see when the GPU has finished the calculation?
Thanks for help.

Yes, this is possible. Check the programming manual for details :)

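CUDA kernel launches from the CPU side are always “asynchronous”: the launch returns control to the CPU right away.
See the “cudaThreadSynchronize” call (deprecated in newer CUDA versions in favor of “cudaDeviceSynchronize”), which must be called explicitly for the CPU to wait for GPU completion. If it is NOT called, the CPU keeps running in parallel with the GPU computation.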
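
For the "return a status for checking" part of the question, one option is to record an event right after the launch and poll it with cudaEventQuery, which returns cudaErrorNotReady until the GPU reaches that point. Here is a minimal sketch of that idea, not a definitive implementation. The names vecAdd and runAsyncVectorAdd are made up for illustration, the default stream is assumed, and the counter in the loop is only a stand-in for whatever CPU work you actually want to do.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Element-wise vector addition: c[i] = a[i] + b[i].
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical helper (not a CUDA API): launches vecAdd, keeps the CPU busy
// while polling for completion, then verifies the result on the host.
bool runAsyncVectorAdd(int n) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n, 0.0f);

    float *da = nullptr, *db = nullptr, *dc = nullptr;
    cudaEvent_t done = nullptr;
    bool ok = cudaMalloc(&da, bytes) == cudaSuccess &&
              cudaMalloc(&db, bytes) == cudaSuccess &&
              cudaMalloc(&dc, bytes) == cudaSuccess &&
              cudaEventCreate(&done) == cudaSuccess &&
              cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;

        // The launch returns immediately; the GPU works in the background.
        vecAdd<<<blocks, threads>>>(da, db, dc, n);
        // The event completes once all preceding work in the default stream is done.
        ok = cudaEventRecord(done, 0) == cudaSuccess;
    }

    if (ok) {
        unsigned long cpuWork = 0;
        cudaError_t status;
        // Poll without blocking: cudaErrorNotReady means the GPU is still busy.
        while ((status = cudaEventQuery(done)) == cudaErrorNotReady) {
            ++cpuWork;  // placeholder for useful CPU-side work
        }
        printf("CPU loop iterations while GPU was busy: %lu\n", cpuWork);
        ok = (status == cudaSuccess);
    }

    // This copy is safe because the event has already reported completion.
    ok = ok && cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i) {
        if (hc[i] != 3.0f) ok = false;  // 1.0f + 2.0f is exactly 3.0f
    }

    if (done) cudaEventDestroy(done);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return ok;
}

int main() {
    bool ok = runAsyncVectorAdd(1 << 20);
    printf("%s\n", ok ? "vector add OK" : "vector add FAILED");
    return ok ? 0 : 1;
}
```

If you do not need an event, cudaStreamQuery on the stream you launched into gives the same kind of non-blocking status. When the CPU runs out of other work, a blocking wait such as cudaEventSynchronize avoids spinning in the loop.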