I’m now trying to create a project that supports CMake.
Thanks to tonycdy1991, I could finally create a CMake project that should work for you. However, I have not yet applied all the changes I made in my other “hard-coded” project, so it may take a few days until I can submit it.
Minimizing the project leads to completely different run-time behaviour, which I have encountered several times now.
Btw. OptiX (and especially the denoiser) is really awesome! After watching the SIGGRAPH 2017 videos I finally understand why it can use an albedo buffer. Really great!
First, I can reproduce the cudaDriver().CuMemcpyDtoHAsync() illegal address error with the first OAC or the MiniTest.zip slightly adapted to run in my environment.
At this time I’m unable to say if this error comes from your unexpected raytype usage, or from the fact that you’re constantly exchanging the top-level scene graph node, or toggling the miss program back and forth.
Here are my recommendations to get the program into a state which should work and actually perform much better:
You’re mixing two different renderers without adapting the different ray types they use in each of the original examples. That means you have declared only three ray types in this context, and the photon mapper uses 0, 1, 2 as defined in the enum RayTypes. The path tracer also uses 0 and 1, set up in InitDiffuseRayTrayer(), but with a different meaning defined by the different rtPayload. I would not do that.
To stay consistent, you should use separate ray types for each algorithm, which means five ray types in this case.
That needs changes in the material program assignments and the rtTrace() calls.
I normally hardcode all ray types with #defines and omit needless variable declarations for these. Saves variables and global accesses.
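As a rough illustration of the five ray types and the hard-coded #defines mentioned above, a header shared by host and device code could look like the sketch below. All macro names and the helper function are hypothetical (the thread does not show the real identifiers), and which index stands for which kind of ray is an assumption; only the counts (three for the photon mapper, two for the path tracer) come from the discussion.

```cpp
#include <cassert>

// Hypothetical names; only the counts (3 + 2 = 5) come from the thread.
// Photon mapper: keeps the indices 0, 1, 2 it already uses.
#define RAYTYPE_PPM_RADIANCE 0
#define RAYTYPE_PPM_SHADOW   1
#define RAYTYPE_PPM_PHOTON   2
// Path tracer: moved from 0, 1 to its own indices 3, 4.
#define RAYTYPE_PT_RADIANCE  3
#define RAYTYPE_PT_SHADOW    4
// Total number of ray types the context would have to declare.
#define NUM_RAYTYPES         5

static_assert(RAYTYPE_PT_SHADOW + 1 == NUM_RAYTYPES,
              "ray type indices must be contiguous and start at 0");

// Hypothetical helper: true if the index lies in the path tracer's range.
inline bool isPathTracerRayType(unsigned int rayType)
{
  assert(rayType < NUM_RAYTYPES);
  return rayType >= RAYTYPE_PT_RADIANCE;
}
```

With a layout like this, the ray type count would be set once on the context, and the material program assignments and rtTrace() calls of each renderer would refer only to their own indices, so ray type 0 no longer has two meanings.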
You seem to try to resolve that by constantly exchanging the top-level scene graph node, but that is going to be really slow and should be unnecessary.
Then you’re constantly reassigning the miss program, because of the shared ray type 0.
I have not looked further into what the idea of the program flow should achieve, but this is absolutely not how an OptiX program should be structured to be fast and efficient.
You could avoid all that switching completely, for example with proper material assignments which handle all ray types as desired. You could also have two top-level nodes, and each ray generation algorithm would use one. Or you could have the whole scene under a Selector node, and the ray generation entry points would set a field in the rtPayload which selects which child to traverse.
The goal here should be to not reconstruct the scene topology every launch of the different algorithms and not incur expensive recompiles.
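The Selector idea can be modelled in plain C++ to show why nothing needs to be rebuilt between launches. This is only a conceptual sketch with stand-in types, not OptiX API; every name in it (Payload, SubScene, Selector, launch) is hypothetical. In a real OptiX program the choice would be made on the device, inside the Selector’s visit program.

```cpp
#include <cassert>
#include <vector>

// Plain C++ stand-ins, not OptiX types. All names are hypothetical.
struct Payload
{
  unsigned int childIndex; // written by the ray generation entry point
};

struct SubScene
{
  int id; // stands for the sub-graph used by one algorithm
};

struct Selector
{
  std::vector<SubScene> children; // filled once at setup, never changed

  // Models a visit program: pick the child named by the payload.
  const SubScene& visit(const Payload& prd) const
  {
    assert(prd.childIndex < children.size());
    return children[prd.childIndex];
  }
};

enum Algorithm : unsigned int { PHOTON_MAPPER = 0, PATH_TRACER = 1 };

// Models an entry point: it only writes the payload field.
// The scene topology itself is left untouched.
inline int launch(const Selector& top, Algorithm algo)
{
  Payload prd;
  prd.childIndex = algo;
  return top.visit(prd).id;
}
```

Both algorithms share one top-level object here; the only per-launch difference is the value written into the payload.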
Other than that:
- Do not declare any variables between launches! That invokes an expensive recompile.
- Do not change the scene topology between launches! That incurs a validation step, an acceleration build, and potentially an expensive recompile.
- Do not switch the miss program back and forth on the one ray type. That invokes an expensive recompile.
There are faster ways to switch functions instantaneously via function tables which can be implemented with buffers of bindless callable program IDs in OptiX.
- Please avoid macros in C++ code. That’s not debuggable.
- Get rid of the #define width WIDTH and #define height HEIGHT.
- The C++ wrapper function addChild() already adds 1 to the child count. No need to set it to 1 afterwards in your macro.
- There is a duplicate definition of #define ExceptionDepth in ppm_rtpass2.cu.
- Consider avoiding tabs in source code based on OptiX example code.
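The function table mentioned in the list above can be pictured with ordinary function pointers. The following is a plain C++ analogy under that assumption, not OptiX code: the table stands in for a buffer of bindless callable program IDs, and all names (ShadeFn, shadeDouble, shadeNegate, callById) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Plain C++ analogy, not OptiX API. All names are hypothetical.
typedef float (*ShadeFn)(float);

inline float shadeDouble(float x) { return 2.0f * x; }
inline float shadeNegate(float x) { return -x; }

// Built once at setup; stands in for a buffer of callable program IDs.
static const ShadeFn g_functionTable[] = { shadeDouble, shadeNegate };
static const std::size_t g_functionCount =
    sizeof(g_functionTable) / sizeof(g_functionTable[0]);

// Switching behaviour means passing a different index.
// The table itself is never modified between calls.
inline float callById(std::size_t id, float x)
{
  assert(id < g_functionCount);
  return g_functionTable[id](x);
}
```

The point of the design is that selecting another function is just a data change (an index), whereas reassigning the miss program changes the program setup and, as noted above, invokes a recompile.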
Let’s assume there is still a bug in OptiX. All these recommendations are meant to unblock you now and direct your implementation into a performant application architecture. I simply expect this to work when doing all these things.
Taking some OptiX SDK examples which are only meant to work in isolation and plugging them together without a complete overview of what really happens can lead to such unpleasant experiences.
I would recommend taking a step back and considering what you’re really trying to achieve in your application, then architecting your source code from scratch incrementally, from minimal working code to more working code, so that you can isolate any issue in each development step, until the implementation fulfills the requirements most efficiently.
That requires a thorough understanding of all OptiX features. My recommendations on how to learn from the examples and apply this knowledge in your own applications are listed here (slightly outdated version, but still mostly applicable): Tutorials & Webcasts - OptiX - NVIDIA Developer Forums
Hi Detlef,
thank you for your detailed answer.
I am starting again from the beginning and adding all the small working pieces together now.
I completely started from zero again and now have a project
which runs a Whitted ray tracer and a path tracer independently without problems.
However, if the denoiser is added (when both ray tracers are active), it crashes: (716): Misaligned address
I sent the source code project through private message.
For now (due to time constraints) I will discard further tests with 2 ray tracers, so that I can focus on implementing the mesh+texture handling interface with my main app for path tracer + denoiser + depth handling.
Thank you Detlef for all your help.
I really appreciate that.
Thanks, got it downloaded and both reproduce the error.
I now use only one TOP GROUP instead of a group hierarchy. That one group contains all geometry groups. And now the misaligned address exception is gone!!!