Hi, I noticed a real performance boost when using a float3 instead of three separate floats. The documentation mentions that a float3 read from global memory takes the same time as a float read. For me this was critical, since most of the time was spent reading and writing the data. I still want to try texture memory but haven't gotten around to it yet, and sadly there is no code example showing how to use it :(
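In case it is useful, here is a minimal sketch of the comparison, not my actual code. The kernel names (scaleInterleaved, scaleFloat3), the scale factor and the array size are made up for illustration. Both kernels read the same buffer, once as plain floats at 3*i, 3*i+1, 3*i+2 and once as float3, and the host only checks that they produce the same result. It does not time anything, so you would need to profile both on your own card. One thing worth checking against the programming guide: float3 is 12 bytes with 4-byte alignment, while single-instruction global loads are described for 1, 2, 4, 8 and 16 byte words, so float4 may be worth timing too.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Variant A: each thread reads its three components as separate floats.
__global__ void scaleInterleaved(const float* in, float* out, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[3 * i + 0] = s * in[3 * i + 0];
        out[3 * i + 1] = s * in[3 * i + 1];
        out[3 * i + 2] = s * in[3 * i + 2];
    }
}

// Variant B: each thread reads and writes one float3 element.
__global__ void scaleFloat3(const float3* in, float3* out, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float3 v = in[i];
        out[i] = make_float3(s * v.x, s * v.y, s * v.z);
    }
}

int main()
{
    const int n = 1 << 20;          // number of 3-component elements (assumed)
    const float s = 2.0f;           // arbitrary scale factor
    const size_t bytes = 3 * static_cast<size_t>(n) * sizeof(float);

    // Fill the host buffer with values that are exact in single precision.
    std::vector<float> hIn(3 * n), hA(3 * n), hB(3 * n);
    for (int i = 0; i < 3 * n; ++i) hIn[i] = 0.5f * static_cast<float>(i % 100);

    float *dIn = nullptr, *dOutA = nullptr, *dOutB = nullptr;
    CHECK(cudaMalloc(&dIn, bytes));
    CHECK(cudaMalloc(&dOutA, bytes));
    CHECK(cudaMalloc(&dOutB, bytes));
    CHECK(cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    scaleInterleaved<<<blocks, threads>>>(dIn, dOutA, s, n);
    CHECK(cudaGetLastError());

    // float3 has the same layout as three consecutive floats, so the same
    // buffer can be viewed as an array of n float3 elements.
    scaleFloat3<<<blocks, threads>>>(reinterpret_cast<const float3*>(dIn),
                                     reinterpret_cast<float3*>(dOutB), s, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaMemcpy(hA.data(), dOutA, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(hB.data(), dOutB, bytes, cudaMemcpyDeviceToHost));

    // Both variants should compute identical values.
    int mismatches = 0;
    for (int i = 0; i < 3 * n; ++i)
        if (hA[i] != hB[i] || hA[i] != s * hIn[i]) ++mismatches;
    std::printf("mismatches: %d\n", mismatches);

    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOutA));
    CHECK(cudaFree(dOutB));
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

To compare speed you could wrap each launch in cudaEventRecord / cudaEventElapsedTime, or just run it under the profiler.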
Hope this helps
Eri