I have been reading up on GPU-accelerated raycasting on the CUDA architecture, specifically this recent thesis (http://ivokabel.wz.cz/pages/myWorks/Ivo_Pavlik_-_Thesis.pdf) and what is probably the first paper on doing this in CUDA (http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4634648). I was wondering why the performance results never include a comparison against a naive implementation (one without any acceleration structure). In my experience, using any acceleration structure on GPUs makes performance worse out of the box, and to get the benefits you need to tweak your kernels carefully. One more thing: just looking at the performance results, the maximum frame rates for the approach in the Pavlik thesis are around 20 - 25 fps, whereas the basic volume rendering sample shipped with the CUDA SDK gives around 30 fps unoptimized and, with a few optimizations, can easily reach around 60 fps.
Does anyone else feel the same?