Context launch pre-processing

I have tried to find resources on what exactly happens before the first context launch, but unfortunately I have not found much.

I have a simple program that creates 1M instances of spheres with random centers. These are put into a geometry group.
Then I tried doing a dummy context launch.
It seems not to matter which builder I use; all of them yield the same time when I profile the first launch.
Note that I validate and compile the context before profiling the launch.
The time also seems to increase quadratically as I increase the number of instances.
What other operations are taking place before the first launch?
Does “NoAccel” build anything?

Best,
Sukumar

Depending on the OptiX version you use, the compile step is a no-op.
More precisely, since OptiX 4 the very first launch of each entry point triggers a compilation of the kernel with all programs reachable at that time.

This is going to be improved in the future. See this GTC 2018 presentation for an outlook:
http://on-demand-gtc.gputechconf.com/gtc-quicklink/d3dVR

Additionally it builds the acceleration structures (AS) when they are dirty.

It could simply be that you’re measuring the compile time of your OptiX device programs in this case, and the AS build is comparably quick.
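One way to separate that one-time compile/build cost from the steady-state launch time is to time the first (dummy) launch and a second launch independently. The sketch below uses a `FakeContext` stand-in (my own placeholder, not an OptiX type) so it runs anywhere; in a real program you would time `context->launch(...)` the same way:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Time a single call in milliseconds.
template <typename F>
double timeCallMs(F&& f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Stand-in for an OptiX context: the first launch() is slow
// (simulating kernel compilation + AS build), later launches are fast.
struct FakeContext
{
    bool firstLaunchDone = false;
    void launch()
    {
        if (!firstLaunchDone) {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            firstLaunchDone = true;
        }
    }
};
```

Timing the two calls separately makes the split obvious: the first measurement contains compile plus build, the second is the pure launch cost.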

I’m measuring these things as well in my OptiX Introduction examples in case you want to compare results. The geometry in these is runtime generated and you can increase the number of primitives as needed. Find links to the presentation and source code here:
https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/

>>I have a simple program that creates 1M instances of spheres with random centers. These are put into a geometry group.<<

Do you really mean instances, like with a million transforms over them, or just all of them in a single Geometry node under a GeometryInstance under a GeometryGroup (which is the smallest possible OptiX scene graph with geometry in it)?
If the former, that would be really slow. In that case please read this thread and all the threads linked inside it:
https://devtalk.nvidia.com/default/topic/1036468/optix/the-best-way-to-represent-10m-spherical-particles-/
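For reference, the layout recommended in those threads — one Geometry node holding all primitives — could look roughly like this with the legacy OptiX C++ host API. This is an untested sketch; the PTX file name, program names, and the `material` variable are placeholders:

```cpp
// Sketch only (legacy OptiX C++ API, not compiled here).
// One Geometry node holds all 1M sphere primitives; the custom
// bounds/intersection programs index into a buffer of sphere data.
optix::Context context = optix::Context::create();

optix::Geometry spheres = context->createGeometry();
spheres->setPrimitiveCount(1000000u);  // all spheres are primitives of ONE node
spheres->setBoundingBoxProgram(context->createProgramFromPTXFile("spheres.ptx", "bounds"));
spheres->setIntersectionProgram(context->createProgramFromPTXFile("spheres.ptx", "intersect"));
// Sphere centers and radii live in a buffer read by the programs above.

optix::GeometryInstance gi = context->createGeometryInstance();
gi->setGeometry(spheres);
gi->setMaterialCount(1);
gi->setMaterial(0, material);  // 'material' created elsewhere

optix::GeometryGroup group = context->createGeometryGroup();
group->setChildCount(1);
group->setChild(0, gi);
group->setAcceleration(context->createAcceleration("Trbvh"));
```

This is the 1 Geometry / 1 GeometryInstance / 1 GeometryGroup case discussed below, where the per-node validation overhead stays constant no matter how many primitives you add.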

Thanks for the reply,

>> Do you really mean instances, like with a million transforms over them, or just all of them in a single Geometry node under a GeometryInstance under a GeometryGroup (which is the smallest possible OptiX scene graph with geometry in it)?

Yes, 1M sphere geometries, each of them in a single GeometryInstance, put in a single GeometryGroup.

What is still puzzling is that when I use “NoAccel” and increase the number of spheres, the initial context launch time still goes up. Shouldn’t “NoAccel” give a constant time regardless of how many instances I create (this is a dummy context launch)?

“Yes, 1M sphere geometries, each of them in a single GeometryInstance, put in a single GeometryGroup.”

That “each” in the sentence still makes it unclear what your scene graph looks like.
What is the number of Geometry, GeometryInstance, and GeometryGroup nodes in your scene?

If the answer is 1, 1, and 1 then NoAccel should not add overhead.
Anyway, NoAccel doesn’t make sense for this use case, because each ray would be tested against every individual primitive. Just don’t use it.

If the answer is 1M, 1M, and 1 then you’re doing it wrong. I would expect that the first launch has a lot of validation overhead then. Don’t use this scene layout! Follow the recommendations in the linked threads.

>>If the answer is 1M, 1M, and 1 then you’re doing it wrong. I would expect that the first launch has a lot of validation overhead then. Don’t use this scene layout! Follow the recommendations in the linked threads.<<

Yes, this is what I have done.
I understand it’s not the right way to go about it, as pointed out in the thread.
I am merely trying to understand the dependencies that a context launch entails.

It is clear that the time is lost validating the geometry. That explains a lot!