I’m using OptiX on a Windows 7 system, and I’m aware of the display driver timeout problem.
Since I don’t want to use multiple graphics cards or change the operating system, I am limited to the
‘split your launch’ approach.
But what is the correct way to split a single launch?
Using SampleScene/GLUTDisplay to render my images, a single-launch approach would look like this:
void Scene::trace( ... ) {
    // other stuff like camera changes
    // ...
    m_context->launch( entry_point, width, height );
}
Trying to split it up, I first did something like this:
void Scene::trace( ... ) {
    // other stuff like camera changes
    // ...
    // pretend 'launches' is a vector filled with individual regions for each launch
    for (unsigned int i = 0; i < launches.size(); i++) {
        m_context->launch( entry_point, launches[i].width, launches[i].height );
    }
}
When that did not work (it still killed the display driver), I tried something that, in my mind, simply had to work:
void Scene::trace( ... ) {
    // other stuff like camera changes
    // ...
    // one tile per trace() call; wrap around after the last tile
    m_current_launch = m_current_launch % launches.size();
    m_context->launch( entry_point, launches[m_current_launch].width, launches[m_current_launch].height );
    m_current_launch++;
}
Basically, this draws one tile after another, which seems to work at first glance but still crashes the display driver when the overall workload is increased. Even decreasing the size of each launch call to an unreasonably small amount does not help (besides, the overhead of these launches becomes ridiculous at that point).
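For illustration, here is a minimal sketch of how a 'launches' vector like the one above could be filled. LaunchRegion and makeTiles are made-up names, not part of OptiX or SampleScene, and the x/y offsets are an assumption: the launch call itself only takes a width and a height, so the offset would have to reach the ray generation program some other way (for example through a context variable), which is not shown here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper type: one rectangular launch region, in pixels.
struct LaunchRegion {
    unsigned int x, y;          // top-left offset inside the full image
    unsigned int width, height; // extent of this tile
};

// Split a width x height image into tiles of at most tile_w x tile_h pixels.
// Tiles on the right and bottom edges are clipped to the image bounds.
std::vector<LaunchRegion> makeTiles(unsigned int width, unsigned int height,
                                    unsigned int tile_w, unsigned int tile_h) {
    std::vector<LaunchRegion> tiles;
    if (tile_w == 0 || tile_h == 0) return tiles; // a zero step would never terminate
    for (unsigned int y = 0; y < height; y += tile_h) {
        for (unsigned int x = 0; x < width; x += tile_w) {
            LaunchRegion r;
            r.x = x;
            r.y = y;
            r.width = std::min(tile_w, width - x);
            r.height = std::min(tile_h, height - y);
            tiles.push_back(r);
        }
    }
    return tiles;
}
```

Calling makeTiles(width, height, 32, 32) once at setup would give the list that trace() then walks through, one entry per launch.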
To avoid some confusion:
- said launch takes up 98%+ of processing time each time trace is called
- launches take more time on well lit tiles (expected)
- rendering the whole image is possible if overall workload is small
- current goal is to render the Sponza scene with about 50k VPLs per trace call (does not seem over the top to me)
The crash does not happen right away, but on specific tiles.
Those tiles are very well lit, so they might take longer than others, but is there some reason other than a timeout that could cause the display driver to crash (such as a division by zero)?
Is my approach to splitting the workload valid?
Would the first approach suffice, or do I need the second one?
Am I missing something?
Some people referred to rtContextSetTimeoutCallback(), but I don't know whether it is needed or how it's used (what is the callback function supposed to do when it's called?).
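For context on that last question: as far as I can tell from the OptiX 3.x headers, the callback is a plain int (*)(void) function that OptiX polls during long-running API calls, and a nonzero return value asks OptiX to abort the current call. The following is only a sketch of what such a function might look like under that assumption. armDeadline, timeoutCallback and g_deadline are made-up names, and the registration call is left as a comment because it needs the OptiX headers and is untested here.

```cpp
#include <chrono>

// Hypothetical time budget, set by the application before each launch.
static std::chrono::steady_clock::time_point g_deadline;

// Call right before a launch to allow it at most `seconds` of wall time.
void armDeadline(double seconds) {
    g_deadline = std::chrono::steady_clock::now() +
                 std::chrono::duration_cast<std::chrono::steady_clock::duration>(
                     std::chrono::duration<double>(seconds));
}

// Matches the assumed callback shape int (*)(void):
// returns 0 to keep going, nonzero to request an abort.
int timeoutCallback(void) {
    return std::chrono::steady_clock::now() >= g_deadline ? 1 : 0;
}

// Assumed registration (not compiled here, needs the OptiX headers):
// rtContextSetTimeoutCallback(context, timeoutCallback, 0.1);
```

Whether aborting a launch this way actually keeps the display driver from being reset is exactly the part I am unsure about.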