synchronization and timing

When a kernel is launched, is the launch synchronous or asynchronous with respect to the host? That is, does the call return immediately, or does it not return until the kernel has finished?

Is it valid to use a CPU timing function to measure the elapsed time of a kernel execution, like this:

QueryPerformanceCounter(&start);
kernel<<<...>>>(...);
cudaThreadSynchronize();  // wait for the kernel to finish
QueryPerformanceCounter(&end);
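
To make the question concrete, here is a minimal sketch of the pattern I have in mind. It rests on a few assumptions: the `scale` kernel, the `time_kernel_ms` helper, and the launch configuration are hypothetical and only there for illustration; `std::chrono::steady_clock` stands in for `QueryPerformanceCounter` so the code is not Windows-specific; and `cudaDeviceSynchronize()` is used in place of the deprecated `cudaThreadSynchronize()`. As far as I understand, the launch itself returns before the kernel finishes, so the synchronize call before reading the end time is what makes the host clock cover the whole execution.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: doubles each element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Times one launch of scale() with a host clock.
// Returns milliseconds, or -1.0 if a CUDA call fails.
double time_kernel_ms(int n) {
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return -1.0;
    cudaMemset(d, 0, n * sizeof(float));
    cudaDeviceSynchronize();  // drain earlier GPU work so it is not counted

    auto start = std::chrono::steady_clock::now();
    scale<<<(n + 255) / 256, 256>>>(d, n);      // launch is queued; host continues
    cudaError_t err = cudaDeviceSynchronize();  // block until the kernel is done
    auto end = std::chrono::steady_clock::now();

    cudaFree(d);
    if (err != cudaSuccess) return -1.0;
    return std::chrono::duration<double, std::milli>(end - start).count();
}

int main() {
    time_kernel_ms(1 << 20);  // warm-up: the first launch includes one-time setup
    printf("kernel time: %.3f ms\n", time_kernel_ms(1 << 20));
    return 0;
}
```

The measured interval includes launch overhead as well as the kernel itself, so for very short kernels it may be worth timing many launches in a loop and dividing by the count.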