Interface for ray tracing on Fermi? Can the ray-triangle intersection test be accelerated by Fermi?

I am porting a ray-tracing code I wrote from C to OpenCL/CUDA. I remember that when Fermi came out, the new feature list included “real time ray-tracing”. How does this work on Fermi? Does CUDA provide kernels with out-of-the-box ray-tracing functionality, or does one have to write their own?

The core of my ray tracer is a ray-triangle intersection test (the code is not for computer graphics, but for scientific computing). Are there any faster alternatives in CUDA on Fermi than wrapping my own C code in a CUDA kernel?

thank you

There’s no special API for ray tracing included in CUDA. You can either use a library implemented in CUDA, such as NVIDIA OptiX (http://developer.nvidia.com/object/optix-home.html), or write your own kernels.

I believe OptiX only works on Quadro and Tesla cards, not on GeForces.

OptiX is supported on Fermi-based GeForces [1].

[1] http://developer.nvidia.com/object/optix-home.html

CUDA does not include a ray tracer; use OptiX or write your own intersector in CUDA C (which is basically easy; the trick is to optimize it well ;) ).
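
To give an idea of what "write your own intersector" involves, here is a minimal sketch of one common choice, the Möller-Trumbore ray-triangle test, in plain C. It is not taken from OptiX or from the linked paper; the names `vec3`, `v_sub`, `v_cross`, `v_dot`, `ray_triangle_mt` and the tolerance `eps` are made up for illustration. The code avoids C-only constructs, so it should be possible to mark the functions `__host__ __device__` and call them from a CUDA kernel, but that step is an assumption and is not shown here.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical 3-component vector type used only in this sketch. */
typedef struct { float x, y, z; } vec3;

vec3 v_sub(vec3 a, vec3 b) {
    vec3 r;
    r.x = a.x - b.x; r.y = a.y - b.y; r.z = a.z - b.z;
    return r;
}

vec3 v_cross(vec3 a, vec3 b) {
    vec3 r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}

float v_dot(vec3 a, vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Moller-Trumbore test for the ray orig + t * dir against triangle
 * (v0, v1, v2). Returns true on a hit with t > eps and then writes the
 * distance t and the barycentric coordinates u, v. On a miss the
 * outputs are left untouched. */
bool ray_triangle_mt(vec3 orig, vec3 dir, vec3 v0, vec3 v1, vec3 v2,
                     float *t, float *u, float *v)
{
    const float eps = 1e-7f;           /* assumed tolerance, tune per data */
    vec3 e1 = v_sub(v1, v0);
    vec3 e2 = v_sub(v2, v0);
    vec3 p = v_cross(dir, e2);
    float det = v_dot(e1, p);
    if (fabsf(det) < eps) return false; /* ray parallel to triangle plane */

    float inv = 1.0f / det;
    vec3 s = v_sub(orig, v0);
    float uu = v_dot(s, p) * inv;
    if (uu < 0.0f || uu > 1.0f) return false;

    vec3 q = v_cross(s, e1);
    float vv = v_dot(dir, q) * inv;
    if (vv < 0.0f || uu + vv > 1.0f) return false;

    float tt = v_dot(e2, q) * inv;
    if (tt <= eps) return false;        /* triangle is behind the origin */

    *t = tt; *u = uu; *v = vv;
    return true;
}
```

The test itself is only a few dozen floating-point operations per ray-triangle pair. The optimization effort mentioned above usually goes into what surrounds it: the acceleration structure that decides which triangles get tested, and the memory layout of rays and triangles.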

This (graphics) raytracer is pretty fast:
http://www.tml.tkk.fi/~timo/publications/a…09hpg_paper.pdf

Here is a discussion about intersectors:
http://ompf.org/forum/viewtopic.php?f=16&t=1565

Thank you all for the useful info. Does the “intersector” here include something like a “Plücker-coordinates-based” ray-triangle test kernel? If I write my own CUDA code to do this test (which is straightforward), will I miss any special hardware acceleration that OptiX has?

In some of the links, people are talking about billions of triangles per second. Was that achieved by a highly optimized CUDA kernel, or by some hardware acceleration?

thanks again

I think there IS a special API for ray tracing in CUDA. If you take a look at the OptiX source and binary, there are functions such as _rt_trace, _rt_buffer_get, _rt_buffer_get_size, _rt_potential_intersection… that are not defined or implemented by OptiX itself. The only reasonable explanation is that those functions are implemented by the kernel driver itself. I haven’t tried to write PTX that calls those functions directly, but either way there’s some magic behind them.

There is no special hardware support used in OptiX (it is written in CUDA), but it is very well optimized.
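
Since no special hardware is involved, the Plücker-coordinate test asked about earlier in the thread can be written as ordinary code. Below is a minimal sketch in plain C, under a few assumptions: the function names `pl_cross`, `pl_dot`, `pl_side` and `ray_triangle_plucker` are hypothetical, double precision is an arbitrary choice, and the Plücker coordinates are recomputed per call rather than precomputed per edge as an optimized version might do. It is not how OptiX does it.

```c
#include <stdbool.h>

void pl_cross(const double a[3], const double b[3], double out[3]) {
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

double pl_dot(const double a[3], const double b[3]) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Permuted inner product of the ray (origin o, direction d) with the
 * directed edge a -> b. A line through p and q has Plucker coordinates
 * (q - p, p x q); the sign of the result tells on which side of the
 * edge the ray passes. */
double pl_side(const double o[3], const double d[3],
               const double a[3], const double b[3])
{
    double ray_m[3], edge_m[3], edge_d[3];
    pl_cross(o, d, ray_m);   /* ray moment:  o x (o + d) = o x d */
    pl_cross(a, b, edge_m);  /* edge moment: a x b */
    for (int i = 0; i < 3; ++i) edge_d[i] = b[i] - a[i];
    return pl_dot(d, edge_m) + pl_dot(edge_d, ray_m);
}

/* Returns true if the ray o + t * d, t > 0, hits triangle (v0, v1, v2)
 * and writes the ray parameter to *t. The three side values double as
 * unnormalized barycentric weights of v0, v1, v2. */
bool ray_triangle_plucker(const double o[3], const double d[3],
                          const double v0[3], const double v1[3],
                          const double v2[3], double *t)
{
    double w0 = pl_side(o, d, v1, v2);
    double w1 = pl_side(o, d, v2, v0);
    double w2 = pl_side(o, d, v0, v1);

    bool all_nonneg = (w0 >= 0.0 && w1 >= 0.0 && w2 >= 0.0);
    bool all_nonpos = (w0 <= 0.0 && w1 <= 0.0 && w2 <= 0.0);
    if (!all_nonneg && !all_nonpos) return false; /* line passes outside */

    double sum = w0 + w1 + w2;
    if (sum == 0.0) return false; /* coplanar ray or degenerate input */

    /* The sign test only covers the infinite line, so recover the hit
     * point from the weights and project it onto the ray to get t. */
    double along = 0.0, dd = 0.0;
    for (int i = 0; i < 3; ++i) {
        double p = (w0 * v0[i] + w1 * v1[i] + w2 * v2[i]) / sum;
        along += (p - o[i]) * d[i];
        dd += d[i] * d[i];
    }
    double tt = along / dd;
    if (tt <= 0.0) return false;  /* triangle is behind the origin */
    *t = tt;
    return true;
}
```

In a tuned version the edge coordinates would typically be computed once per triangle and stored, which trades memory for fewer operations per test. Whether that beats a test such as Möller-Trumbore on a given GPU is something to measure rather than assume.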
