The lack of dynamic block replacement in hardware is hurting these algorithms (see the URL above). The financial algorithms start with big arrays that taper away like a triangle as the computation proceeds, which exposes latencies because less and less parallel work is left to hide them. Anyway, we can always write code to work around it.
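To make the triangle shape concrete, here is a minimal CPU-side sketch in Python. It is not the CUDA code behind the numbers in this thread; it assumes a standard Cox-Ross-Rubinstein binomial lattice for a European call, and the function and parameter names are made up for illustration. It returns the price together with the number of active nodes at each step, so the tapering is visible.

```python
import math


def binomial_european_call(spot, strike, rate, vol, maturity, steps):
    """Price a European call on a Cox-Ross-Rubinstein lattice (illustrative sketch).

    Returns (price, active_counts), where active_counts[k] is the number of
    lattice nodes still being processed after k backward steps.
    """
    dt = maturity / steps
    up = math.exp(vol * math.sqrt(dt))       # up-move factor
    down = 1.0 / up                          # down-move factor
    disc = math.exp(-rate * dt)              # one-step discount factor
    p = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Terminal payoffs: the widest row of the triangle, steps + 1 nodes.
    values = [
        max(spot * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]
    active_counts = [len(values)]

    # Backward induction: every step shrinks the array by one node.
    for level in range(steps, 0, -1):
        values = [
            disc * (p * values[j + 1] + (1.0 - p) * values[j])
            for j in range(level)
        ]
        active_counts.append(len(values))

    return values[0], active_counts
```

With `steps=4` the active node counts are `[5, 4, 3, 2, 1]`. On a GPU, each of those counts is roughly the amount of independent work available at that step, so the tail of the computation has very little parallelism left.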
Posted up-to-date results for binomial and trinomial option pricing!
Two approaches have been tried for trinomial pricing, and both have been profiled! Both have yielded good results, and the speedup hovers around 96x for trinomial. This betters the previous 76x.
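For readers who have not met the trinomial lattice, here is a minimal CPU-side reference sketch in Python. It is not either of the two GPU approaches mentioned in this thread; it assumes the common parameterisation with an up factor of exp(vol * sqrt(2 * dt)), a flat middle branch, and no dividends, and all names are hypothetical.

```python
import math


def trinomial_european_call(spot, strike, rate, vol, maturity, steps):
    """Price a European call on a recombining trinomial lattice (illustrative sketch).

    Returns (price, active_counts); a level n of the lattice has 2 * n + 1 nodes.
    """
    dt = maturity / steps
    up = math.exp(vol * math.sqrt(2.0 * dt))  # up-move factor; down is 1 / up
    disc = math.exp(-rate * dt)               # one-step discount factor

    # Risk-neutral branch probabilities for the up, down and middle moves.
    a = math.exp(rate * dt / 2.0)
    b = math.exp(vol * math.sqrt(dt / 2.0))
    p_up = ((a - 1.0 / b) / (b - 1.0 / b)) ** 2
    p_down = ((b - a) / (b - 1.0 / b)) ** 2
    p_mid = 1.0 - p_up - p_down

    # Terminal payoffs: node j holds the price spot * up ** (j - steps).
    values = [
        max(spot * up ** (j - steps) - strike, 0.0)
        for j in range(2 * steps + 1)
    ]
    active_counts = [len(values)]

    # Backward induction: every step shrinks the array by two nodes.
    for level in range(steps - 1, -1, -1):
        values = [
            disc * (p_up * values[j + 2] + p_mid * values[j + 1] + p_down * values[j])
            for j in range(2 * level + 1)
        ]
        active_counts.append(len(values))

    return values[0], active_counts
```

With `steps=3` the active node counts are `[7, 5, 3, 1]`. Each node reads three neighbours instead of two, and the array shrinks twice as fast per step as in the binomial case.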
Kindly look at the attachment in the first post. I have replaced it with the most recent numbers!
With an enhanced parallel algorithm, we are getting a speedup of around 100x to 200x for the binomial option pricing algorithm against CPU code compiled with -O2.
The trinomial speedup for European options is around 48x now.
GPU: 8800 GTX, CPU: AMD Athlon 2.41GHz
A detailed speedup summary for European and American option pricing will be posted within two weeks.