Hi to all of you,
I use OptiX for an ordinary raytracer. Until now I could fix every problem by myself or with Google.
I load a scenegraph and build an OptiX-Context accordingly:
At the root I use an RTGroup with an attached BVH accelerator.
Below this root is an arbitrary tree of RTGroups (nodes), each with an attached BVH accelerator.
At the leaves I use RTGeometrygroups, again each with an attached BVH accelerator and an RTGeometry.
This system works; I get the desired frames.
Since all the geometry consists of triangles, I want to use the triangle-specific acceleration structure (builder TriangleKdTree, traverser KdTree) at the lowest level (RTGeometrygroup). This is the only thing I changed from the working system.
I set up the accelerator and its vertex and index buffer properties as follows:
rtAccelerationCreate(m_context, &m_o_accel);
rtAccelerationSetBuilder(m_o_accel, "TriangleKdTree");
rtAccelerationSetTraverser(m_o_accel, "KdTree");
rtAccelerationSetProperty(m_o_accel, "vertex_buffer_name", "vertexBuffer");
rtAccelerationSetProperty(m_o_accel, "vertex_buffer_stride", "0");
rtAccelerationSetProperty(m_o_accel, "index_buffer_name", "vertexIndexBuffer");
rtAccelerationSetProperty(m_o_accel, "index_buffer_stride", "0");
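To spell out what I expect the two stride values of "0" to mean: as far as I understand the documentation, a stride of 0 is treated as "tightly packed". That reading is my assumption and not something I have verified. Under that assumption, the byte strides for my layout (float3 vertices, three consecutive ints per triangle) would come out as in the following plain C sketch. The names vec3f, effective_stride, vertex_stride_bytes and triangle_stride_bytes are hypothetical and not part of OptiX.

```c
#include <stddef.h>

/* Stand-in for CUDA's float3 so the sketch compiles without CUDA headers. */
typedef struct { float x, y, z; } vec3f;

/* Assumption (not verified against OptiX): a requested stride of 0 means
   "tightly packed", i.e. the stride equals the size of one element. */
static size_t effective_stride(size_t requested_stride, size_t element_size)
{
    return requested_stride == 0 ? element_size : requested_stride;
}

/* Byte distance between two consecutive vertices in vertexBuffer. */
static size_t vertex_stride_bytes(size_t requested_stride)
{
    return effective_stride(requested_stride, sizeof(vec3f));
}

/* Byte distance between two consecutive triangles in vertexIndexBuffer,
   which stores three ints per triangle. */
static size_t triangle_stride_bytes(size_t requested_stride)
{
    return effective_stride(requested_stride, 3 * sizeof(int));
}
```

If the builder interprets 0 differently, or expects a different index element type, the strides above would not match what it reads, which is one of the things I would like to rule out.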
But I get the following error:
Unknown error (Details: Function "RTresult _rtContextLaunch2D(RTcontext_api*,
unsigned int, RTsize, RTsize)" caught exception: Encountered a CUDA error:
Kernel launch returned (702): Launch timeout, [6619200])
If I am not mistaken, this means that the call to rtContextLaunch2D takes too long.
Therefore I tried to minimize the work:
- 6x6 pixels
- no recursion except towards the light sources --> no shadows, transparencies, refractions, reflections
- 589 vertices in 5 buffers
- 3348 vertex indices in 5 buffers
- before the first real launch, a call to rtContextLaunch2D(context, 0, 0) --> compiles in 5.5-6.5 s
The system is as follows:
- SUSE Linux Enterprise Desktop 11 (x86_64)
- Quadro 6000
- Nvidia driver 304.54
- Cuda 5.0
- OptiX 3.0.1
To make sure the buffers are set up correctly, I tried changing their lengths and names, which produced validation errors. They are the same buffers I use in the intersection test:
RT_PROGRAM void mesh_intersect(int primIdx)
{
    // Three consecutive indices per triangle
    int v_id0 = vertexIndexBuffer[primIdx * 3 + 0];
    int v_id1 = vertexIndexBuffer[primIdx * 3 + 1];
    int v_id2 = vertexIndexBuffer[primIdx * 3 + 2];

    // Look up the three corner positions
    float3 p0 = vertexBuffer[v_id0];
    float3 p1 = vertexBuffer[v_id1];
    float3 p2 = vertexBuffer[v_id2];

    // intersection tests follow
}
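Since the kd-tree builder reads these buffers itself, a host-side check of the index data is an obvious thing to run before the launch. Below is a minimal plain C sketch of such a check. It mirrors the indexing in mesh_intersect (three consecutive ints per triangle) and confirms that every index points inside the vertex array. The function name check_triangle_indices and its return convention are hypothetical and not part of OptiX or my application.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an OptiX call.
   Returns -1 if every triangle only references valid vertices,
   -2 if index_count is not a multiple of 3,
   otherwise the number of the first triangle with an out-of-range index. */
static long check_triangle_indices(const int *indices,
                                   size_t index_count,
                                   size_t vertex_count)
{
    size_t tri, k;

    assert(indices != NULL);

    if (index_count % 3 != 0)
        return -2;

    for (tri = 0; tri < index_count / 3; ++tri) {
        for (k = 0; k < 3; ++k) {
            /* Same addressing as mesh_intersect: primIdx * 3 + k */
            int v = indices[tri * 3 + k];
            if (v < 0 || (size_t)v >= vertex_count)
                return (long)tri;
        }
    }
    return -1;
}
```

For example, the index list {0, 1, 2, 0, 2, 3} with 4 vertices passes and returns -1, while {0, 1, 2, 0, 2, 4} with 4 vertices returns 1, because the second triangle references a vertex that does not exist.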
Is it even possible to mix the different acceleration structures for the different tiers of the tree?
I hope someone can point me in the right direction to fix this problem!
If you need further information, I will gladly provide it (hopefully not a minimal working example, which would be rather hard to make because the application uses MPI to run on a distributed system).
Greetings
Frieder