CUDA and raytracing

Hello again,

I have been implementing a raytracer using CUDA (primary rays only). My scenes are static and contain between 1M and 6M polygons.

How would you lay out your rays and scene data?

  1. Rays as kernel data (one thread per ray), with the mesh triangles bound as a read-only texture.

  2. Mesh triangles as kernel data (one thread per triangle), with the rays bound as a read-only texture. This is essentially a brute-force approach: test each triangle against all the rays, then send the results back to the CPU and sort out the closest hit, etc…
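To make option 1 concrete, here is a rough CPU-side C++ sketch of the work a single thread would do for one ray: loop over every triangle and keep the nearest hit. It is only a reference under my own assumptions, not CUDA code. The names `Vec3`, `Ray`, `Triangle`, `Hit`, `intersect` and `closestHit` are made up for illustration, and the intersection test is the standard Möller-Trumbore one. Without any spatial structure the cost is rays × triangles, which is what makes 1M-6M polygons painful.

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };
struct Triangle { Vec3 v0, v1, v2; };
struct Hit { int triangle; float t; };  // triangle == -1 means "no hit"

// Moller-Trumbore ray/triangle test. Returns true and writes the ray
// parameter t when the ray hits the triangle in front of its origin.
bool intersect(const Ray& r, const Triangle& tri, float& t) {
    const float eps = 1e-7f;
    Vec3 e1 = sub(tri.v1, tri.v0);
    Vec3 e2 = sub(tri.v2, tri.v0);
    Vec3 p = cross(r.dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray parallel to the triangle
    float inv = 1.0f / det;
    Vec3 s = sub(r.origin, tri.v0);
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(r.dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > eps;
}

// Per-ray brute force: what one thread would do in option 1.
Hit closestHit(const Ray& r, const std::vector<Triangle>& tris) {
    Hit best{-1, std::numeric_limits<float>::infinity()};
    for (int i = 0; i < static_cast<int>(tris.size()); ++i) {
        float t;
        if (intersect(r, tris[i], t) && t < best.t) best = {i, t};
    }
    return best;
}
```

With this layout each ray keeps its own closest hit, so nothing has to be sorted afterwards. In option 2 several triangles can hit the same ray, so the per-ray minimum has to be resolved in a separate step.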

Which do you think is better?

  1. Use stackless spatial structures.
  2. Use brute-force data streaming.
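By "stackless" I mean something along these lines. This is a minimal CPU-side C++ sketch of one possible scheme, a BVH flattened in depth-first order with skip links, and not necessarily the best choice for CUDA. `Aabb`, `BvhNode`, `hitsBox` and `candidates` are hypothetical names. For brevity it assumes the ray direction has no zero components and only collects the leaf primitives whose boxes the ray touches; the exact triangle test would follow.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

struct Aabb { float lo[3], hi[3]; };

// Nodes are stored in depth-first order, so the first child of an inner
// node is always at index + 1. `skip` is the next node outside this
// subtree, or -1 when traversal is finished.
struct BvhNode {
    Aabb box;
    int skip;       // where to go when this subtree is missed or done
    int primitive;  // triangle index for leaves, -1 for inner nodes
};

// Slab test against an axis-aligned box using the inverse direction.
bool hitsBox(const float o[3], const float invDir[3], const Aabb& b) {
    float tmin = 0.0f;
    float tmax = std::numeric_limits<float>::infinity();
    for (int a = 0; a < 3; ++a) {
        float t0 = (b.lo[a] - o[a]) * invDir[a];
        float t1 = (b.hi[a] - o[a]) * invDir[a];
        if (t0 > t1) std::swap(t0, t1);
        tmin = std::max(tmin, t0);
        tmax = std::min(tmax, t1);
    }
    return tmin <= tmax;
}

// Stackless traversal: no recursion and no per-ray stack, only one index.
std::vector<int> candidates(const std::vector<BvhNode>& nodes,
                            const float o[3], const float d[3]) {
    // Assumption: d has no zero components.
    const float inv[3] = {1.0f / d[0], 1.0f / d[1], 1.0f / d[2]};
    std::vector<int> out;
    int i = nodes.empty() ? -1 : 0;
    while (i != -1) {
        const BvhNode& n = nodes[i];
        if (!hitsBox(o, inv, n.box)) {
            i = n.skip;                   // skip the whole subtree
        } else if (n.primitive >= 0) {
            out.push_back(n.primitive);   // leaf: record it and move on
            i = n.skip;
        } else {
            i = i + 1;                    // inner node: descend to first child
        }
    }
    return out;
}
```

The appeal for the GPU is that each ray only carries a single node index, at the cost of always visiting children in a fixed order instead of nearest-first.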

What are your experiences with CUDA and raytracing?

And a question… does Gelato use CUDA for raytracing? How is raytracing done there?