I just read through the reference of the “optix::ScopedObj Class”. I was wondering if there is a way to access the declared variables of classes which are derived from “optix::ScopedObj Class Reference” (e.g. rtProgramObj or rtMaterialObj) from the device side (.cu files).
So far I have only accessed variables on the device side which were globally defined by the context on the host side (context[“my_variable”]->set_float3(…)).
What I am trying to do is implement multiple cameras with a camera class. For each camera instance, a new entry point is defined. Since I am using the same camera model for each camera, but with different parameters, I was hoping to be able to set these variables for each ray generation program.
If the camera model is always the same, there is no need for multiple entry points; that would just waste runtime performance. Simply change the parameters.
Or, if there are too many of them, store all of them in a user format buffer as an array of parameter blocks and switch the index which selects the block of parameters to use.
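To make the parameter-block idea concrete, here is a minimal host-side sketch in plain C++. It is only a stand-in and does not use the OptiX API: the struct CameraParams, its fields, CameraTable and make_table are hypothetical names chosen for illustration. In a real renderer the vector would be a user format buffer and "active" would be an index variable set before each launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter block for one camera (field names are illustrative).
struct CameraParams {
    float position[3];
    float u[3];
    float v[3];
    float w[3];
};

// Stand-in for a user format buffer holding one parameter block per camera.
struct CameraTable {
    std::vector<CameraParams> blocks;
    std::size_t active = 0; // stand-in for the index variable set before a launch

    // Switch cameras by changing only the index, not the program.
    void select(std::size_t index) {
        assert(index < blocks.size());
        active = index;
    }

    // What the single ray generation program would read.
    const CameraParams& current() const { return blocks[active]; }
};

// Builds a table with `count` zero-initialized blocks and tags each block
// with its own index in position[0] so the selection can be checked.
CameraTable make_table(std::size_t count) {
    CameraTable table;
    table.blocks.resize(count);
    for (std::size_t i = 0; i < count; ++i) {
        table.blocks[i].position[0] = static_cast<float>(i);
    }
    return table;
}
```

The point of the design is that all cameras share one program and one entry point; only the selected block changes between launches.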
If your cameras are just generating different projections, but use the same rendering method, there is also no need for separate entry points.
You could implement the generation of primary rays with different projections as a buffer of bindless callable program IDs, which is just a function table in OptiX terms.
Then the switching of the camera projection would be as simple as changing an index variable.
That works similarly well for other fixed function code like BSDF sampling and evaluation or light sampling routines.
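The function-table approach can be sketched in plain C++ as an array of function pointers selected by an index. This is an analogy only, not OptiX code: RayDir, pinhole, orthographic, kProjections and generate_primary_ray are hypothetical names, and in OptiX the table entries would be bindless callable program IDs stored in a buffer rather than C++ function pointers.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Direction of a primary ray (origin omitted to keep the sketch short).
struct RayDir {
    float dx, dy, dz;
};

// Every projection shares one signature, so they can live in one table.
using ProjectionFn = RayDir (*)(float, float);

// Pinhole projection: x and y are screen coordinates in [-1, 1].
RayDir pinhole(float x, float y) {
    const float len = std::sqrt(x * x + y * y + 1.0f);
    return {x / len, y / len, -1.0f / len};
}

// Orthographic projection: all rays are parallel to the view axis.
RayDir orthographic(float, float) {
    return {0.0f, 0.0f, -1.0f};
}

// The "function table": index 0 is pinhole, index 1 is orthographic.
const std::array<ProjectionFn, 2> kProjections = {{pinhole, orthographic}};

// Switching the projection is just a matter of changing projection_index.
RayDir generate_primary_ray(std::size_t projection_index, float x, float y) {
    assert(projection_index < kProjections.size());
    return kProjections[projection_index](x, y);
}
```

The same pattern would apply to the other fixed-function pieces mentioned above, such as BSDF sampling or light sampling, with one table per kind of routine.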
By implementing multiple entry points, I was hoping to speed up the rendering process itself.
Do several entry points not render simultaneously? I want to render a static scene with about 3000 different cameras (entry points) at once. Is there a way to achieve that?
No, separate rtContextLaunch*() calls do not run in parallel. rtContextLaunch*() is a blocking, synchronous call in all shipping OptiX versions. Only the kernel invocations for the individual launch indices within each rtContextLaunch are executed in parallel.
3000 OptiX entry points is most likely not going to work for anything at all. Don’t even try!
That will definitely not increase performance in any way. On the contrary, the compile time will skyrocket, including 3000 compiles when entering each new entry point, and I doubt the resulting kernel will be reasonably sized.
If you need to render 3000 views of a static model with the same renderer but different views, then the first approach would be to implement that as 3000 individual launches with different camera parameters.
Once you have determined the performance of that, you might be able to combine work into fewer launches.
Depending on the OS, driver model, and hardware you’re using, there are some limits which would affect that combination of the work into fewer launches.
For example, you could actually render all these 3000 images in one launch if the combined work took less than the 2-second Timeout Detection and Recovery (TDR) limit under the Windows Display Driver Model (WDDM), which is unlikely, or if you’re using a dedicated Tesla or Quadro board in Tesla Compute Cluster (TCC) driver mode. In either case the resulting data would also have to fit into VRAM, which is unlikely as well unless the images are really, really low-res in both size and data format.
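To get a feeling for the VRAM constraint, the output size can be estimated with simple arithmetic. The helper below is a hypothetical sketch in plain C++, and the resolutions and pixel formats used as examples are assumptions, not values from this thread. With 3000 views at 512 x 512 and 16 bytes per pixel (a float4 output), the result is 12,582,912,000 bytes, roughly 11.7 GiB. With 3000 views at 256 x 256 and 4 bytes per pixel (a uchar4 output), it is 786,432,000 bytes, which is 750 MiB.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold the output of all views at once.
// 64-bit arithmetic is used because the product easily exceeds 32 bits.
std::uint64_t output_bytes(std::uint64_t views,
                           std::uint64_t width,
                           std::uint64_t height,
                           std::uint64_t bytes_per_pixel) {
    assert(bytes_per_pixel > 0);
    return views * width * height * bytes_per_pixel;
}

// True if the combined output alone would fit into the given amount of VRAM.
// This ignores the scene, acceleration structures and other buffers, so it is
// an optimistic lower bound on what is actually required.
bool output_fits(std::uint64_t required_bytes, std::uint64_t vram_bytes) {
    return required_bytes <= vram_bytes;
}
```

For instance, output_fits(output_bytes(3000, 512, 512, 16), 8ULL << 30) is false for an 8 GiB board, while the 256 x 256 uchar4 case fits comfortably.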