Porting my renderer from C++ to CUDA - the speed gains and their cost. Weekend project - attempts to

I spent last weekend porting my software-only renderer to CUDA. I blogged about my results here, where I also share my code under the GPL. Note that this is not about “doing graphics with CUDA instead of OpenGL” - it was just an exercise in porting some algorithms (that happen to be graphics algorithms) to CUDA, and a write-up of my experience in doing so.

Feel free to comment on my efforts:

  • any positive feedback and/or algorithmic suggestions are most welcome.
  • please refrain from bashing; I tried to be objective :-)

Kind regards,
Thanassis Tsiodras, Dr.-Ing.

I suggest doing more research before drawing any conclusions. Also, there are profilers that report divergent branch rates. You can use them to check.