The best way to represent 10M spherical particles?

Hello,

I’m creating a global illumination system for a huge number of spherical particles with non-zero radius; say there are 10M of them.

There seem to be many ways to achieve this:

a) Generating a triangle mesh for all of these particles. With 100 triangles per sphere, that would be 1G triangles, which may not fit in GPU memory.

b) Generating a triangle mesh for a single particle, then making 10M instances of it. This sounds more reasonable, but is there an acceleration structure that supports so many instances? (Maybe a two-level one?)

Theoretically, the best solution seems to be instancing a bunch of analytical (instead of triangulated) spheres. Is there an easy way to do so?

Thanks in advance for your help.

Best,
Yuanming Hu

Yes, but don’t instance them via transforms!
Read these threads on similar topics:
https://devtalk.nvidia.com/default/topic/1026659/optix/interactive-geometry-replacement/
https://devtalk.nvidia.com/default/topic/1027203/?comment=5226059

I especially recommend following the link to the GTC presentation in the second thread to get an impression of what’s possible!

If putting 10M parametric spheres (float4, .xyz = center, .w = radius) into one buffer is too big, you can also split them into individual GeometryGroups, each with its own Acceleration, for example 1 million spheres each.
If you order them spatially, that might even help BVH traversal.
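In case it helps, here is a minimal, untested sketch of the bounding box program such a float4 sphere buffer needs (following the pattern of the SDK samples; the buffer and program names are just placeholders):

[code]
// Bounding box program for a buffer of parametric spheres
// (float4: .xyz = center, .w = radius).
#include <optix_world.h>

using namespace optix;

rtBuffer<float4> spheres;  // one float4 per particle

RT_PROGRAM void bounds(int primIdx, float result[6])
{
  const float4 s = spheres[primIdx];
  Aabb* aabb = (Aabb*) result;
  aabb->m_min = make_float3(s.x - s.w, s.y - s.w, s.z - s.w);
  aabb->m_max = make_float3(s.x + s.w, s.y + s.w, s.z + s.w);
}
[/code]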

OptiX contains an example of a sphere intersection routine. You can find all the example intersection routines by searching the *.cu files in the OptiX SDK for “rtPotentialIntersection”.
That’s just solving a quadratic equation, and if your spheres are opaque, you can even speed it up by using only the smaller of the two solutions (the one subtracting the square root), since a ray starting outside an opaque sphere can only see the nearer intersection.
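For reference, an intersection program along those lines could look like this (untested sketch with placeholder names; the sphere sample in the SDK is the authoritative version):

[code]
// Analytic sphere intersection: solve |O + t*D - C|^2 = r^2 for t.
#include <optix_world.h>

using namespace optix;

rtBuffer<float4> spheres;

rtDeclareVariable(Ray, ray, rtCurrentRay, );
rtDeclareVariable(float3, shading_normal, attribute shading_normal, );

RT_PROGRAM void intersect(int primIdx)
{
  const float4 s      = spheres[primIdx];
  const float3 center = make_float3(s);
  const float  radius = s.w;

  const float3 O = ray.origin - center;
  const float3 D = ray.direction;  // assumed normalized, so a == 1

  const float b    = dot(O, D);
  const float c    = dot(O, O) - radius * radius;
  const float disc = b * b - c;

  if (disc > 0.0f)
  {
    // Take only the nearer root (the one subtracting the square root);
    // that is sufficient for opaque spheres seen from outside.
    const float t = -b - sqrtf(disc);
    if (rtPotentialIntersection(t))
    {
      // Attributes must be written between rtPotentialIntersection
      // and rtReportIntersection.
      shading_normal = (O + t * D) / radius;
      rtReportIntersection(0);  // material index 0
    }
  }
}
[/code]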

Make sure you’re not triggering the specialized triangle builder inadvertently.
https://devtalk.nvidia.com/default/topic/994092/?comment=5085657

Thanks so much! I followed your suggestions and it worked.

A (crazy) scene with 19.5M spheres: [url]https://ibb.co/cHYGFJ[/url]

Btw, I spent a lot of time (almost 2h, unfortunately) debugging the shaders. I find that any error (such as forgetting to specify the normal, or specifying an unnecessary normal when there’s no rtReportIntersection) leads to the following error:

Unknown error (Details: Function “RTresult _rtContextLaunch2D(RTcontext, unsigned int, RTsize, RTsize)” caught exception: Assertion failed: “!m_enteredFromAPI : Memory manager already entered from API”, file: /root/sw/wsapps/raytracing/rtsdk/rel5.1/src/Memory/MemoryManager.cpp, line: 963

As you can see, this is not very helpful for diagnosis. Is there a way to get a more detailed error message?

On my 1080Ti, I can fit 39M particles with 11G VRAM… amazing…

Now we’re talking! Very nice.

Actually I would expect that you can fit more particles into 11GB; 39M float4 spheres are only about 0.6 GB of raw data, so most of that memory goes to the acceleration structure and other buffers. (For triangles my very coarse rule of thumb is 10 MTriangles per 1 GB.)
If you’re using the Trbvh builder, look into its chunk settings (acceleration properties) to possibly use less VRAM at build time.
Also splitting into multiple GeometryGroups as mentioned before might help to overcome limits due to VRAM fragmentation.
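On the host side, setting the chunk size could look roughly like this (untested sketch using the C++ wrapper; context and geometry_group are assumed to exist already, and the 512 MB value is just an example):

[code]
// Host side: limit the Trbvh builder's temporary memory with its
// "chunk_size" acceleration property (value in bytes, as a string).
optix::Acceleration accel = context->createAcceleration("Trbvh");
accel->setProperty("chunk_size", "536870912");  // build in ~512 MB chunks
geometry_group->setAcceleration(accel);
[/code]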

Assuming the spheres are not ordered in some grid, the striped artifacts in the back of the image are possibly from shadow acne due to self-intersections.
You would need to increase the “scene epsilon” value, which avoids these self-intersections when starting a new ray from a hit surface, or use a different, more robust method to avoid them.
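In code, that just means using the scene epsilon as the tmin of rays you start from a surface. A sketch with made-up variable names (the common scene_epsilon convention is an assumption, not mandated by OptiX):

[code]
#include <optix_world.h>

using namespace optix;

struct ShadowPRD
{
  float attenuation;
};

rtDeclareVariable(float,    scene_epsilon, , );
rtDeclareVariable(rtObject, top_object, , );
rtDeclareVariable(Ray,      ray,   rtCurrentRay, );
rtDeclareVariable(float,    t_hit, rtIntersectionDistance, );
rtDeclareVariable(float3,   light_direction, , );  // assumed to be set by the host

RT_PROGRAM void closest_hit()
{
  const float3 hit_point = ray.origin + t_hit * ray.direction;

  ShadowPRD prd;
  prd.attenuation = 1.0f;

  // tmin = scene_epsilon keeps the new ray from immediately re-hitting
  // the surface it just left (shadow acne).
  Ray shadow_ray = make_Ray(hit_point, light_direction,
                            1 /* shadow ray type */,
                            scene_epsilon, RT_DEFAULT_MAX);
  rtTrace(top_object, shadow_ray, prd);
}
[/code]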

I have not seen that error myself.
I could imagine that the compiler inside OptiX gets confused when it doesn’t find an rtReportIntersection after an rtPotentialIntersection.
If you can provide minimal and complete failing shader code, we can take a look.

The attributes themselves must only be calculated between rtPotentialIntersection and rtReportIntersection!
If you forget to declare an attribute which is used in an any hit or closest hit program, OptiX will report that mismatch.
If you declare and calculate more attributes inside the intersection program than you use inside the any hit and closest hit programs, OptiX will automatically remove the unused ones as dead code.
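For illustration (the attribute name is made up), the matching declarations look like this:

[code]
// In the intersection program's .cu file:
rtDeclareVariable(float3, shading_normal, attribute shading_normal, );
// ...written between rtPotentialIntersection and rtReportIntersection.

// In the any hit / closest hit program's .cu file (must match):
rtDeclareVariable(float3, shading_normal, attribute shading_normal, );
// ...read when shading the hit point.
[/code]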

If you forget to initialize data in the per-ray payload, the resulting error normally shows up as small 8x4 screen-space rectangles of corruption in your image.

For other debug methods have a look into the OptiX introduction examples. Links here:
[url]https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/[/url]
I implemented my own “rtAssert” function which uses rtThrow and user exceptions; a sketch of that pattern is below. Search the source code for the define USE_DEBUG_EXCEPTIONS to find some more debug tricks.
Note that since version 4.0, enabling exception handling in OptiX impacts runtime performance dramatically.
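The basic rtThrow pattern looks like this (untested sketch; the exception code name is made up, and user exceptions must also be enabled on the host, e.g. via context->setExceptionEnabled(RT_EXCEPTION_USER, true) plus a context->setExceptionProgram(...) call):

[code]
#include <optix_world.h>

#define MY_ASSERT_CODE (RT_EXCEPTION_USER + 0)

rtDeclareVariable(uint2, launch_index, rtLaunchIndex, );

// Inside any device program:
//   if (!condition) rtThrow(MY_ASSERT_CODE);

RT_PROGRAM void exception()
{
  const unsigned int code = rtGetExceptionCode();
  if (code == MY_ASSERT_CODE)
    rtPrintf("assert failed at launch index (%u, %u)\n",
             launch_index.x, launch_index.y);
  else
    rtPrintExceptionDetails();  // built-in output for standard exceptions
}
[/code]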