Wow, that was quick! Thank you, all three, for replying.
@SPWorley: I agree that data transfer operations reduce the full-application speedup, but I am focusing on the processing speed of the GPU. I get a speed gain for a portion of my application. I am deliberately leaving out data transfer (I/O-bound operations) to keep the CPU vs. GPU comparison fair. My aim is to compare the processing times of the GPU and the CPU for the full application. The tutorials and papers I have come across put more stress on processing speed or bandwidth as the performance metric. Full-application speedup depends on the nature of the application. My application is not 100% parallelizable, but the portions that are parallelized show a local speed gain. This prompted me to use Amdahl's law to compare the maximum overall speedup against my observed results.
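To make that check concrete, here is a minimal sketch in Python, not my actual code. The names `amdahl_speedup` and `amdahl_limit` and the numbers in the example are hypothetical; `p` is the fraction of the original run time that is parallelized and `s` is the local speedup measured on that portion.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the run time is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def amdahl_limit(p: float) -> float:
    """Upper bound on overall speedup as s grows without bound."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p == 1.0:
        return float("inf")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not measurements.
    p, s = 0.6, 10.0
    print("predicted overall speedup:", amdahl_speedup(p, s))
    print("upper bound for this p:   ", amdahl_limit(p))
```

If the observed overall speedup is far below `amdahl_speedup(p, s)`, the gap is probably time that the model ignores, such as the data transfers mentioned above.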
I am waiting for the day when researchers make the GPU an independent unit, so that it no longer relies on the CPU for control.
@avidday: Thanks. It is clear :)
@mfatica: I may be wrong, but I think the formula Ts/Tp describes the speedup from pipelined execution on a uniprocessor. I agree with your view on the latter part.
I came across the paper by M. D. Hill and M. R. Marty, "Amdahl's Law in the Multicore Era", IEEE, 2008. It discusses symmetric and asymmetric multiprocessors.
I have an NVIDIA GeForce 8800 GS, which has 12 multiprocessors (MPs) with 8 cores per MP, i.e. 96 cores.
So I am assuming that my GeForce 8800 GS is a symmetric multiprocessor.
The paper uses a different formula for speedup. Can you comment on it? Which one should I choose: the one given on Wikipedia, i.e. Overall speedup = 1 / [(1 - P) + P/S], or the one given in the paper mentioned above?
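For comparison, here is a rough Python sketch of the two formulas side by side. The symmetric formula is written as I understand it from the paper, so please check it against the original: `f` is the parallel fraction, `n` is the total chip budget in base-core equivalents, `r` is the budget spent per core, and `perf(r) = sqrt(r)` is the example performance model I believe the paper uses. The function names and the mapping of my card to `n = 96`, `r = 1` are my own assumptions.

```python
import math


def amdahl_speedup(p: float, s: float) -> float:
    """Classic form: overall speedup = 1 / [(1 - P) + P/S]."""
    return 1.0 / ((1.0 - p) + p / s)


def perf(r: float) -> float:
    """Assumed sequential performance of a core built from r base-core equivalents."""
    return math.sqrt(r)


def symmetric_speedup(f: float, n: float, r: float) -> float:
    """Symmetric multicore speedup (as I read Hill and Marty): n/r identical cores."""
    serial_time = (1.0 - f) / perf(r)        # one core runs the serial part
    parallel_time = f * r / (perf(r) * n)    # n/r cores share the parallel part
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    f = 0.9  # hypothetical parallel fraction, not a measurement
    # Assumption: treat the card as 96 base cores, each using one unit (r = 1).
    print("symmetric, n=96, r=1:", symmetric_speedup(f, 96, 1))
    print("classic,   S=96:     ", amdahl_speedup(f, 96))
```

With `r = 1` the symmetric formula reduces to the Wikipedia one with S = n, so under that assumption the two should agree; they differ only when cores are built from more than one base-core equivalent.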
Thank you; any help is appreciated.