Overall time consumption: how do I compute how much time my GPU code is consuming?

Hi,
I'm working on an encryption-cracking project. I've implemented my algorithm using CUDA and it's working just fine.
Now I need to know how much time it consumes on both the GPU and the CPU. Since I'm not familiar with some of the concepts, my questions may look stupid.
I'm using three GTX 295 cards on Windows.

1- Each processor core has a clock rate, right? What does it mean? What is the actual meaning of one processor core's clock rate?
2- CUDA's profiler shows my algorithm is occupying 0.25 of Device0. What does that mean?
3- Is it possible to run several copies of my algorithm on one single core? Please explain.
4- The profiler says the main function of my algorithm consumes X microseconds of GPU time and Y microseconds of CPU time. Does that mean one single copy of my algorithm has that cost? (Yes, my code executes just one copy of the algorithm, one time only.) Do the GPU and CPU times mean that both the GPU and the CPU are involved in each execution? And if that's true, if I run hundreds of copies of my algorithm on the GPU, will all of them need clock time from the CPU? If I'm wrong, please explain.
5- In general, how do I compute the execution time of one single copy of an algorithm on the GPU? Is the GPU's frequency involved? Is the CPU's frequency involved?
6- In general, each copy of a kernel runs on one processor core, right? Is it possible to use one processor core to compute several copies of a kernel at the same time?
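For question 5, this is the kind of thing I've been experimenting with so far, based on the CUDA event API from the runtime docs — just a minimal sketch, where `myKernel` is a hypothetical stand-in for my real kernel:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel being timed.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * 2.0f;
}

int main()
{
    const int n = 1 << 20;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    // CUDA events record timestamps on the GPU itself, so the measured
    // interval is device time, independent of the CPU's clock frequency.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);                       // timestamp before launch
    myKernel<<<(n + 255) / 256, 256>>>(d_data, n);   // asynchronous launch
    cudaEventRecord(stop, 0);                        // timestamp after kernel
    cudaEventSynchronize(stop);                      // wait for the GPU to finish

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);          // elapsed time in milliseconds
    printf("kernel time: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Is this the right way to measure one execution, or does the profiler's GPU/CPU time mean something different from what events measure?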

I know that it's a huge list of questions, and I know some of them are off-topic. Well, I'm new — give me initial hints and I'll follow up on them myself.

Thanks in advance :thumbup: