# Amdahl's law and GPUs: how can CUDA break Amdahl's law?

I would like to know how CUDA and massively parallel GPUs can break Amdahl's law.

“For example, if 90% of the program can be parallelized, the theoretical maximum speed-up using parallel computing would be 10x no matter how many processors are used.” http://en.wikipedia.org/wiki/File:AmdahlsLaw.svg

~TheArchitect

Ehm, you can’t break that law; it’s theoretically solid.
You answered your own question: the higher the percentage of work on the GPU, the higher the theoretical limit for the speedup.
So if you put 99% of the work on the GPU, you have a theoretical speedup limit of 100x.
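The arithmetic above can be checked with a minimal Python sketch of Amdahl's law (function names are mine, not from any library):

```python
def amdahl_speedup(p, n):
    """Amdahl's law: speedup on n processors when a fraction p
    of the work is parallelizable (and 1 - p stays serial)."""
    return 1.0 / ((1.0 - p) + p / n)

def amdahl_limit(p):
    """Theoretical speedup limit as n -> infinity."""
    return 1.0 / (1.0 - p)

# 90% parallel: the limit is 10x no matter how many processors
print(round(amdahl_limit(0.90), 2))   # 10.0
# 99% parallel: the limit is 100x
print(round(amdahl_limit(0.99), 2))   # 100.0
```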

N.

Note that the percentage of parallelism may be dependent on the data size.

Care to elaborate? How does this affect Amdahl’s Law? Can it be calculated?

~TheArchitect

That law by itself is a theoretical theorem, so it is unbreakable.

However, the input of the law, i.e. the percentage of parallelism, is manageable. Fortunately, for most GPU-friendly algorithms, this figure usually increases as the data scale expands. Using many GPU cores to solve a small-scale problem may not help much, but they are a good fit for large-scale problems (using the same program).

In addition, when more cores are integrated, the parallelism of the hardware may be increased (e.g. hiding memory access latency).

Hmm, I have seen applications featured in the CUDA applications section which claim a 200x speed-up. Don’t they base this on this law? And there is the claim that a Tesla supercomputer is 250x faster than a standard PC. Wouldn’t the max be somewhere around 19x at 95% parallelism for 960 cores?

~TheArchitect

:unsure:

Interesting!

Are there any general-purpose algorithms for calculating these graphs? Something like Dijkstra’s algorithm.

~TheArchitect

True, it would be somewhere around 19x at 95% parallelism for 960 cores, which leads to the conclusion that your estimate of the percentage on the GPU is wrong: it should be at least 99.5% for a 200x speedup.
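Both figures check out; here is a small sketch (assuming the standard fixed-size form of Amdahl's law) that also inverts the formula to find the parallel fraction needed for a given speedup:

```python
def amdahl_speedup(p, n):
    """Speedup on n cores with parallel fraction p (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

def required_parallel_fraction(target, n):
    """Invert Amdahl's law: the parallel fraction p needed to reach
    a target speedup on n cores (solve 1/((1-p) + p/n) = target)."""
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / n)

print(round(amdahl_speedup(0.95, 960), 1))             # ~19.6x
print(round(required_parallel_fraction(200, 960), 3))  # ~0.996
```

For a 200x speedup on 960 cores the parallel fraction works out to roughly 99.6%, consistent with the "at least 99.5%" estimate above.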

N.

Okay, I found Gustafson’s Law for data parallelism regarding Amdahl’s Law after reading cvnguyen’s reply.

Speedup = (s + p) / (s + p/N) = 1 / (s + p/N), since s + p = 1

With only 1% of the code serial, the maximum on 960 cores is about 91x. Is it possible to get close to a 0% serial fraction with CUDA? John L. Gustafson achieved close to this:

http://www.scl.ameslab.gov/Publications/Gu…aw/Amdahls.html
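For contrast, Gustafson's scaled-speedup view assumes the problem size grows with the number of processors, so the serial fraction s is measured on the parallel run rather than the original one. A rough sketch of the two views side by side (variable names are mine):

```python
def amdahl_speedup(s, n):
    """Fixed-size (Amdahl) speedup: a serial fraction s of the
    original workload stays serial."""
    return 1.0 / (s + (1.0 - s) / n)

def gustafson_speedup(s, n):
    """Scaled (Gustafson) speedup: s is the serial fraction of the
    time observed on the parallel machine; S = s + (1 - s) * n."""
    return s + (1.0 - s) * n

# 1% serial on 960 cores:
print(amdahl_speedup(0.01, 960))     # ~90.7, i.e. roughly 91x on a fixed-size problem
print(gustafson_speedup(0.01, 960))  # ~950.4 when the problem scales with the cores
```

This is why the percentage of parallelism "depends on the data size", as noted earlier in the thread: scaling the problem up shrinks the effective serial fraction.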

What is the best that can be expected with CUDA? I have read other posts claiming around 80x for CUDA.

http://forums.nvidia.com/index.php?showtopic=79694

:mellow:

If you consider Monte Carlo, then with 1000 cores you will have roughly a 1000x speedup, especially if the time to compute a path is “big”… (the time to