I am sharing GPU speedups for binomial and trinomial tree implementations of option pricing.
Parallel algorithm courtesy: Mr. Alexandros Gerbessiotis’ paper on parallelizing binomial and trinomial tree option pricing! Links to the paper are available inside the XLS.
The binomial tree speedup is around 125x against unoptimized CPU code. It hovers around 65x to 85x against optimized (-O2) CPU code!
The trinomial tree speedup is around 96x against unoptimized CPU code. It is around 27x against optimized (-O2) CPU code!
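For anyone who has not seen the tree method before, here is a minimal CPU-side sketch of binomial tree pricing in Python. It is only an illustration under stated assumptions: it uses the standard Cox-Ross-Rubinstein parameterization for a European call, it is not the code behind the numbers above, and it is not the parallel algorithm from the paper. The function name and parameter names are made up for this example. The point it shows is that, within one level of the tree, every node depends only on two nodes of the previous level, so all nodes of a level can be updated independently.

```python
import math


def binomial_call_price(s0, strike, rate, sigma, maturity, steps):
    """European call on a Cox-Ross-Rubinstein binomial tree (illustrative only)."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))        # up-move factor
    down = 1.0 / up                             # down-move factor
    disc = math.exp(-rate * dt)                 # one-step discount factor
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Leaf payoffs: node j has seen j up-moves and (steps - j) down-moves.
    values = [
        max(s0 * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]

    # Backward induction. Each new node reads only values[j] and values[j + 1]
    # from the previous level, so the nodes of one level are independent of
    # each other; the levels themselves must still be processed in order.
    for level in range(steps, 0, -1):
        values = [
            disc * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            for j in range(level)
        ]
    return values[0]


if __name__ == "__main__":
    # Hypothetical inputs chosen for illustration, not taken from the XLS.
    print(binomial_call_price(100.0, 100.0, 0.05, 0.2, 1.0, 500))
```

The inner list comprehension is the part a GPU version would spread across threads; the outer loop over levels is the sequential dependency that limits how much of the work can overlap.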
The CPU used was an AMD Athlon running at 2.41 GHz. The GPU is an 8800 GTX.
The speedup factors do NOT include “memcopy” times (for both input and output).
Subtract roughly 10x to 15x from the factors above to get the real, end-to-end speedups. More in the notes section of the XLS.
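To make the effect of the copy time concrete, here is a small Python sketch of the arithmetic. The timings in it are hypothetical round numbers picked for illustration, not measurements from the XLS, and the function name is made up for this example.

```python
def speedup(cpu_ms, gpu_kernel_ms, gpu_copy_ms=0.0):
    """CPU time divided by total GPU time (kernel plus optional host/device copies)."""
    return cpu_ms / (gpu_kernel_ms + gpu_copy_ms)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, NOT measured values.
    cpu_ms = 1000.0
    kernel_ms = 10.0
    copy_ms = 1.0  # input upload plus result download

    print(speedup(cpu_ms, kernel_ms))           # kernel-only speedup: 100.0
    print(speedup(cpu_ms, kernel_ms, copy_ms))  # with copies: about 90.9
```

Because the kernel time is so small, even a short copy is a noticeable fraction of the total GPU time, which is why the end-to-end factor drops by several x.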
The attached XLS has three sheets: two for trinomial (two different approaches to parallelizing it) and one for binomial. Note that the results published in the sheets are only for comparisons against unoptimized CPU code! We will be publishing results against optimized CPU code soon.
GPUForFinance.xls (74 KB)