Tracing against 2 types of geometry in 1 pipeline

Dear moderators,

In my application I have 2 types of geometry that I want to be able to trace rays against within the same pipeline. The first type of geometry is the real scene geometry, the second type of “geometry” consists of a 3D grid of cells (of which some of them are left out). My goal is to send out a ray, test for intersection with the scene geometry, save the t-value of intersection, then send the ray out again, but this time test for intersection with the 3D cell-grid. I was thinking to do this by merging my geometry to build the acceleration structure and SBT data, making 2 ray types, and have 2 any hit shaders: The first any hit shader will ignore all hits with geometry of type 2, while the second any hit shader will ignore all hits with geometry of type 1. However, I am unsure if this is the proper way to do it. Any thoughts?

Thanks in advance,

  • Chuppa

Hi @Chuppa,

If I understand your situation correctly, I think you might not need more than 1 ray type or any special handling in your hit shaders. You will probably want 2 different any-hit shaders, one for each geometry type. You can freely mix different kinds of geometry in a single scene BVH, and the BVH will handle calling any-hit for all intersections and then returning the hit with the minimum t value and calling closest-hit for you. You do this by building a GAS with your real scene geometry (meshes?) and then build another separate GAS with your grid cells. If these are both inserted into an Instance Accel Structure (IAS), then the effect is that if they overlap, then OptiX will handle sending the same ray through both GASes; you don’t need to worry about sending two rays yourself.

When using any-hit shaders in this scenario, the ray will potentially traverse both GASes, if they overlap, and the any-hit shader will be called for every intersection reported along the ray. You might notice the any-hit shaders being called out of order and very far beyond the closest hit point, since they might (for example) traverse your grid cells first before traversing your meshes or other scene geometry.

Large overlapping BVH sub-trees can decrease performance, but won’t affect correctness. One way to improve performance, if you have the option, is to break your sub-trees into multiple smaller sub-trees. If you build multiple mesh instances, and multiple cell-sub-grid instances, and end up using many GASes instead of just two, it will reduce the amount of overlap and increase performance.


David.


By the way, I should add that if you only need the closest hit, and you were only thinking of any-hit shaders as a way to handle ignoring certain geometry, then you can avoid using any-hit entirely, and only provide a closest-hit shader for each geometry type. Any-hit shaders can sometimes be expensive and you’re likely to get higher performance if you don’t use them when you don’t need them. If it is the case that you don’t require any-hit shaders, you can opt-out of any-hit invocations with one of the OPTIX_{RAY,GEOMETRY,INSTANCE}_FLAG_DISABLE_ANYHIT flags.


David.


Hi David,

Thank you for the broad explanation! In fact, for my scene geometry I only need the closest hit indeed, but for my grid geometry I need all intersections that are closer than the scene geometry intersection. I want to know all the grid cells that are closer along the ray than the closest scene intersection. I assume there is no other way around than to create an any-hit shader for the grid geometry and test against the closest hit of the scene geometry? Also, if I create 2 separate GASes, one for my scene geometry and one for my grid geometry, how does OptiX know which any-hit (or closest-hit) shader belongs to which GAS? In other words, how does it handle the intersections of different GASes separately?

Thanks again,

  • Chuppa

“I assume there is no other way around than to create an any-hit shader for the grid geometry and test against the closest hit of the scene geometry?”

If you need the list of all hits for your grid geometry, then yes you’ll need an any-hit shader.

One trick that could be helpful here - if the bounds of your grid geometry are both smaller and contained entirely inside of the bounds of your scene geometry, what that means is that your scene geometry will be traversed first (because the ray will hit the scene bounds first before it hits the grid bounds.) If that’s the case, then you could take advantage of the fact that you will know the correct scene t-hit value before your grid is traversed, which would allow you to automatically cull any grid intersections with t values that are greater than that without having to store them. Your ray’s tmax value will also be adjusted for you by having hit the scene geometry, so your ray will not end up traversing through the entire grid geometry, which will be faster than traversing the grid first.

It may be possible to add dummy geometry into your scene in order to force the traversal order so that you can do the above. If that doesn’t work, or isn’t viable for you, then the alternative is to store all the t-hit values that you find in your grid any-hit shaders, and then do the culling of your list after the ray is done with traversal, or in other words, after the call to optixTrace() returns.
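As a rough illustration of the "cull after traversal" alternative, here is a minimal sketch in plain C++. The names `GridHit` and `cullGridHits` are made up for this example, and it assumes the grid any-hit program has already recorded one entry per reported hit. In actual device code you would use a fixed-size per-ray buffer rather than `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record written by the grid any-hit program for each reported hit.
struct GridHit {
    float    t;     // ray parameter of the intersection
    unsigned cell;  // index of the grid cell that was hit
};

// Keep only the grid hits that are closer than the closest scene hit and
// return them sorted front to back. sceneT is the t-value of the closest
// scene intersection (or the ray's tmax if the scene was missed).
std::vector<GridHit> cullGridHits(std::vector<GridHit> hits, float sceneT)
{
    assert(sceneT >= 0.0f);

    // Any-hit programs may be called out of order, so filter first...
    hits.erase(std::remove_if(hits.begin(), hits.end(),
                              [sceneT](const GridHit& h) { return h.t >= sceneT; }),
               hits.end());

    // ...and only then sort the survivors by distance along the ray.
    std::sort(hits.begin(), hits.end(),
              [](const GridHit& a, const GridHit& b) { return a.t < b.t; });
    return hits;
}
```

The sort is only needed if you care about the order of the cells along the ray. The filter alone is enough to reject cells behind the scene hit.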

“I create 2 separate GASes, one for my scene geometry and one for my grid geometry, how does OptiX know which any-hit (or closest-hit) shader belongs to which GAS?”

This is what the SBT and various SBT offsets are for. You set up the SBT to have different shaders available, and then put SBT offsets into your accel structures & ray to tell it which SBT entry to use. So conceptually it’s very simple: you have some combination of different geometries, ray types, and materials. The SBT is just a lookup table to make sure the right shader gets invoked when a ray hits something.

So in your case, you can include 2 (or more) different hitgroups in your SBT, one for the grid geometry and one for the scene geometry. Then when you build your GASes (Geometry Accel Structures), each one takes an SBT offset you can use, and in addition, the OptixInstance structure used to build your IAS (Instance Accel Structure) also takes an SBT offset. These can be used individually or in combination to indicate which hitgroup should be invoked upon intersection for each geometry & ray type, for example. The indexing is flexible, which can make it a little confusing at first, but keep the SBT indexing formula nearby and it should be relatively straightforward (also take a minute or two to stare at the examples in this section until they make a little sense): https://raytracing-docs.nvidia.com/optix7/guide/index.html#shader_binding_table#acceleration-structures


David.


Thank you for the help! So if I understand correctly I will create 2 separate GASes: one based on the scene geometry and one based on the 3D grid geometry. Then I create 1 IAS, built over these 2 GASes. After setting up the SBT and corresponding offsets correctly, I pass the IAS traversable handle to my optixTrace call? Sorry if this is trivial, but I am not sure I 100% understand the concept of an IAS. Is the idea behind it merely to avoid redundantly storing multiple copies of geometry models that differ only in their transformation? Reading the programming guide example, I notice that the GAS traversable handle is passed to the OptixInstance used to build the IAS. But in my case I have 2 GASes, so how do I build my IAS over both of them?

Thanks in advance,

Chuppa

Yes, you’ve got it. You’d have 3 calls to optixAccelBuild(), one for each of 2 GASes and one for the IAS.

To build an IAS over multiple instances, you’ll notice in the example you linked to, the last line of code sets numInstances to 1. With more than one instance, you would include multiple instances, each with its own transform, in your instance buffer, then probably rename d_instance to the plural d_instances and make sure to copy the whole buffer (i.e. in the cudaMemcpy, change sizeof(OptixInstance) to numInstances * sizeof(OptixInstance)), and finally set your numInstances value to the number of instances you have.

There are a handful of OptiX SDK samples that demonstrate use of IASes, including optixDynamicGeometry, optixDynamicMaterials, optixHair, optixMotionGeometry, optixSimpleMotionBlur, and optixVolumeViewer.

There are multiple reasons to use an Instance Accel. One of them is to allow duplicate copies of a model in a scene without having to store any duplicates in memory. This is a common and powerful way people ray trace forests and cities, for example, by scattering duplicate trees, or duplicate buildings/cars/people.

Another reason to use instances is to avoid having to model your geometry in world space. It’s better to model things near the origin, and you can use the instance transform to place it somewhere else. This is good for animation, and good for precision, among other things.

A third reason to use instances in OptiX is to mix different geometry types into a single scene, since OptiX currently only has a single geometry type per GAS.


David.


Ok great, that will help me on my way, thanks for all the help and great explanations!

One final question if I may: I was implementing the construction of an IAS, and was not sure if I am correctly calculating the OptixInstance.sbtOffset. Let’s assume I have 2 GASes, each containing 1 SBT record per build input. If I would like to create several instances, is it correct to calculate the offset of instance i as follows: sbtOffset = sbtOffset + i * numRayTypes * gasNumBuildInputs;? sbtOffset is first initialized to 0. For the remaining parameters, numRayTypes stands for the number of ray types of the program, i stands for the index of the current instance and gasNumBuildInputs is the number of build inputs for the GAS that will be assigned to the IAS. If I want to assign multiple GASes to one IAS (as in my use case), I guess I can take the sum of the number of build inputs of each GAS instead of gasNumBuildInputs here? The calculations are a bit tricky, I hope my question was clear.

Thanks in advance,
Chuppa

Whether it’s correct depends on the order of your hitgroups in the SBT. :) The offsets you use just need to match what you put in the table. I’ve had cases where I use a prefix-sum like you describe, when I don’t have an equal number of hitgroups corresponding to each instance, for example. But, keep in mind that it’s okay to have blank entries in your SBT, there is no requirement that your table entries are contiguous. So if it makes the indexing easier to think about, you can allocate for the maximum number of ray types, materials, and geometries that you have, such that there is always a constant SBT stride between entries for two consecutive instances, a different constant stride between SBT entries for two consecutive ray types, etc.
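To make the constant-stride idea concrete, here is a small sketch. `SbtLayout` and its members are hypothetical names, and it assumes one SBT record per build input and an SBT stride equal to the number of ray types. Every instance reserves the same number of slots, so slots that an instance does not use simply stay blank.

```cpp
#include <cassert>

// Hypothetical padded SBT layout: each instance reserves room for the largest
// GAS, so the offset of instance i is just i * instanceStride().
struct SbtLayout {
    unsigned numRayTypes;           // ray types used by the application
    unsigned maxBuildInputsPerGas;  // largest build-input count over all GASes

    // Number of SBT slots reserved per instance (used or blank).
    unsigned instanceStride() const { return numRayTypes * maxBuildInputsPerGas; }

    // Value to store in the instance's sbtOffset field.
    unsigned instanceOffset(unsigned instance) const { return instance * instanceStride(); }

    // Total number of hitgroup records to allocate.
    unsigned recordCount(unsigned numInstances) const { return numInstances * instanceStride(); }

    // Slot of one (instance, build input, ray type) combination.
    unsigned recordIndex(unsigned instance, unsigned buildInput, unsigned rayType) const
    {
        assert(buildInput < maxBuildInputsPerGas);
        assert(rayType < numRayTypes);
        return instanceOffset(instance) + buildInput * numRayTypes + rayType;
    }
};
```

The trade-off is memory for simplicity: the table is larger than a tightly packed one, but no prefix sum is needed to find an instance’s offset.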

One thing that can make this really easy is to create a map or dict structure that keeps your SBT index for each geom-raytype-material combination. You can insert one of these names for each hitgroup / SBT entry you make, and put the SBT index into your dict. If you use a dynamic array of hitgroups, and insert the names as you go just before inserting the hitgroup entry, then the index to save is always the current size of the hitgroup array. Then when building the accel structures, just lookup the index/offset you need to use by name.
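A minimal sketch of such a name-to-index map follows. `SbtBuilder` and `HitgroupRecord` are stand-ins invented for this example; a real record would hold the packed header plus your per-record data.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for a real hitgroup record (hypothetical, for illustration only).
struct HitgroupRecord {
    int programGroup;  // which hitgroup program group the record was packed with
};

class SbtBuilder {
public:
    // Appends a record and remembers its SBT index under `name`.
    // The index is the size of the record array just before the insertion.
    unsigned add(const std::string& name, const HitgroupRecord& rec)
    {
        const unsigned index = static_cast<unsigned>(records_.size());
        const bool inserted = indexByName_.emplace(name, index).second;
        assert(inserted && "duplicate hitgroup name");
        (void)inserted;  // silence unused-variable warnings in release builds
        records_.push_back(rec);
        return index;
    }

    // Looks up the SBT index later, e.g. while filling in instance offsets.
    // Throws std::out_of_range if the name was never added.
    unsigned indexOf(const std::string& name) const { return indexByName_.at(name); }

    const std::vector<HitgroupRecord>& records() const { return records_; }

private:
    std::vector<HitgroupRecord> records_;
    std::unordered_map<std::string, unsigned> indexByName_;
};
```

The naming scheme is up to you. Something like "scene/obj0/radiance" keeps the geometry, object, and ray type readable when debugging.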

Do note that because of the indexing formula, it’s easier to group first by ray type (meaning put different ray types for a given geom+material immediately next to each other in your SBT), and then group by geometry type, and finally by instance. The indexing formula is:

sbt-index = sbt-instance-offset + (sbt-geometry-acceleration-structure-index * sbt-stride-from-trace-call) + sbt-offset-from-trace-call
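Written out as code, the formula is a one-liner. The function and parameter names below are made up for illustration, and the worked example uses hypothetical numbers: a scene GAS with 3 build inputs placed first in the table, followed by the grid GAS, with a single ray type.

```cpp
#include <cassert>

// Direct transcription of the SBT indexing formula for hitgroup records.
unsigned sbtIndex(unsigned instanceOffset,   // sbt-instance-offset
                  unsigned gasIndex,         // sbt-geometry-acceleration-structure-index
                  unsigned strideFromTrace,  // sbt-stride-from-trace-call
                  unsigned offsetFromTrace)  // sbt-offset-from-trace-call
{
    return instanceOffset + gasIndex * strideFromTrace + offsetFromTrace;
}

// Worked example (hypothetical numbers): the grid instance starts at offset 3
// because the 3 scene records come first, so grid cell 2 maps to record 5.
inline void checkWorkedExample()
{
    assert(sbtIndex(3, 2, 1, 0) == 5);
}
```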

The instance example of SBT offsets is pretty close to the scenario you’re describing, it might help clarify things in this case. The table shows how the offsets correspond to the 2 build inputs & SBT indexes: https://raytracing-docs.nvidia.com/optix7/guide/index.html#shader_binding_table#example-sbt-for-a-scene


David.


Another benefit of using instances is that it allows you to use the 8-bit visibilityMask on the instance and in the optixTrace call to quickly determine whether a ray needs to be tested against a whole instance AABB or not.

This means that in your case you could distinguish the geometry GAS from the 3D grid GAS contents at the instance level, and you wouldn’t need an anyhit program on the “geometry” ray type to reject the grid primitives, because the visibilityMask could already reject the whole 3D grid instance at the top level, and vice versa.
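The mask test itself can be modeled in a few lines of plain C++. This is only a sketch of the rule, not OptiX code: the bit assignments and names are hypothetical, and the rule modeled is that an instance is considered only when the ray’s mask and the instance’s mask share at least one set bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments for the two kinds of instances.
constexpr std::uint8_t kSceneMask = 0x01;
constexpr std::uint8_t kGridMask  = 0x02;

// Hypothetical ray types, one per kind of geometry.
enum RayType { RAY_SCENE = 0, RAY_GRID = 1, RAY_TYPE_COUNT };

// Mask to pass along with the trace call for a given ray type.
inline std::uint8_t rayMaskFor(int rayType)
{
    assert(rayType >= 0 && rayType < RAY_TYPE_COUNT);
    return rayType == RAY_SCENE ? kSceneMask : kGridMask;
}

// Model of the instance-level test: the instance is traversed only if the
// bitwise AND of the two masks is non-zero.
constexpr bool instanceVisible(std::uint8_t rayMask, std::uint8_t instanceMask)
{
    return (rayMask & instanceMask) != 0;
}
```

Note that a mask of 0 on an instance makes it invisible to every ray, while 255 on both sides makes everything visible.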

That in conjunction with the OPTIX_{RAY,GEOMETRY,INSTANCE}_FLAG_DISABLE_ANYHIT would allow all kinds of different layout options for the SBT.

When you really only have two GASes, it would also be possible to simply store both traversable handles separately in your launch parameters and then trace against either the geometry handle or the 3D grid handle for your two different ray types.

Using instances allows for more flexible SBT use cases though and the traversal through a two-level hierarchy (IAS->GAS) is fully hardware accelerated on RTX boards.


Here’s a little update. I tried to implement my IAS building method using the indexing formula. So in my case, I have 2 GASes. One for the scene geometry, one for the grid geometry. Each GAS has a number of buildInputs. For the scene geometry this number is equal to the number of objects in the scene (each object has its own mesh), for the grid geometry, this number is equal to the amount of cells (each cell has its own mesh). Every build input for both GASes has OptixBuildInput::triangleArray.numSbtRecords set to 1 (so I expect the SBT GAS index to be equal to the index of the build input). I have only 1 ray type.

I have 1 raygen program, 1 miss program, and 2 hitgroup programs (each consisting of an any-hit and a closest-hit shader, so in fact 4). When I build my SBT, I make hitgroup records for both my scene geometry and my grid geometry, and I let them refer to their corresponding program respectively by calling: OPTIX_CHECK(optixSbtRecordPackHeader(hitgroupPGs[0], &rec)); (for the grid geometry the index in hitgroupPGs is set to 1). For the scene geometry, the number of hitgroup records made is equal to the number of objects in the scene; for the grid geometry, the number of hitgroup records made is equal to the number of cells.

When I create the IAS (I create 2 instances, 1 for each GAS), I initialize sbtOffset to 0 and calculate the sbtOffset per instance as follows: sbtOffset = sbtOffset + numRayTypes * gasNumBuildInputs; // Assuming that each GAS only has 1 SBT record per build input!, with gasNumBuildInputs being the amount of build inputs for the GAS assigned to that instance (here I made the assumption each GAS only has 1 SBT record per build input, so the SBT GAS index is equal to the build input index). As the transform I just pass the identity matrix for both instances.

I put some prints in my device hitgroup programs to test if the program compiles and runs as expected.

My question now is: when I set the instance visibility mask to 255 for my instances, my device program does some prints that it hit grid geometry in the any-hit shader, but then crashes giving a runtime error due to illegal memory access. If I remove this line of code, the program runs, but nothing is printed (and thus the shaders are not executed as they should be). What could possibly be the cause of this?

Thanks in advance,
Chuppa

I’m not sure, but I could guess it’s likely to be one of two things. It could be an illegal memory access in your grid anyhit shader. If you are saving hits and have enough that you run off the end of the array, something like that could do it. Another possibility is that the scene geometry SBT index ends up wrong and causes OptiX to run off the end of the SBT. You can determine which one it is by turning off only the scene geometry (while leaving the grid geometry enabled) and then testing whether the grid anyhit shader still crashes. If so, it might indicate a legitimate bug in the shader, and if not it might indicate a problem with the SBT offset of the scene geometry. The visibility mask usage is just allowing you to hide the problem by disabling the rendering of any geometry before the SBT is accessed and before the grid anyhit shader is called. I’d recommend waiting to play with visibility masks until the SBT offsets are all verified working, and recommend enabling one geometry/instance at a time so you can isolate and resolve any problems.
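One way to check for the second possibility on the host, before launching anything, is to verify that no instance can index past the end of the hitgroup table. The sketch below uses made-up names (`InstanceSbtRange`, `sbtOffsetsInRange`) and assumes one SBT record per build input, an SBT stride equal to the number of ray types, and a ray-type offset smaller than that stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side summary of one instance.
struct InstanceSbtRange {
    unsigned sbtOffset;       // value written to the instance's sbtOffset field
    unsigned numBuildInputs;  // build inputs of the GAS referenced by the instance
};

// Returns true if every SBT index that any instance can produce lies inside a
// hitgroup table with hitgroupRecordCount records.
bool sbtOffsetsInRange(const std::vector<InstanceSbtRange>& instances,
                       unsigned numRayTypes,
                       std::size_t hitgroupRecordCount)
{
    assert(numRayTypes > 0);
    for (const InstanceSbtRange& inst : instances) {
        // One past the largest index: offset + buildInputs * stride.
        const std::size_t end = static_cast<std::size_t>(inst.sbtOffset)
                              + static_cast<std::size_t>(inst.numBuildInputs) * numRayTypes;
        if (end > hitgroupRecordCount) {
            return false;  // this instance could run off the end of the SBT
        }
    }
    return true;
}
```

A check like this only catches out-of-range offsets. Offsets that are in range but point at the wrong hitgroup still need to be found by enabling one instance at a time.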


David.


The problem was indeed in calculating the SBT instance offset. I was accidentally using the number of build inputs of the GAS assigned to the current instance to calculate the offset, when I should have used the number of build inputs of the GAS assigned to the previous instance (with the exception of the first instance, which just gets offset 0). Thanks for all the help!

For anyone who would want to do anything similar in the future, here is my OptixInstance initialization code where I calculate the OptixInstance.sbtOffset:

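	// Needs <cstring> (memcpy), <vector>, <optix.h>, <glm/glm.hpp> and <glm/gtc/type_ptr.hpp>
	IAS::IAS(OptixDeviceContext& context, std::vector<glm::mat4> transforms, std::vector<GAS> gases, int numRayTypes, std::vector<int> gasIndices)
	{
		// Initialize OptixInstances (one per transform; gasIndices[i] selects the GAS of instance i)
		std::vector<OptixInstance> instances;
		unsigned flags = OPTIX_INSTANCE_FLAG_NONE;

		// Running prefix sum over the SBT records used by all previous instances
		unsigned int sbtOffset = 0;
		for (size_t i = 0; i < transforms.size(); i++)
		{
			// If i == 0 the SBT offset needs to be 0
			int prevGasNumBuildInputs = 0;
			if(i > 0) {
				// We take the number of build inputs of the previous GAS to decide our SBT offset
				prevGasNumBuildInputs = gases[gasIndices[i - 1]].getNumBuildInputs();
			}

			sbtOffset = sbtOffset + numRayTypes * prevGasNumBuildInputs;	// Assuming that each GAS only has 1 SBT record per build input!

			OptixInstance instance = {};
			// OptixInstance::transform is a row-major 3x4 matrix, while glm stores matrices
			// column-major, so transpose before copying the first 3 rows (12 floats).
			// This assumes the transforms are ordinary glm matrices; the identity is unaffected.
			const glm::mat4 rowMajor = glm::transpose(transforms[i]);
			memcpy(instance.transform, glm::value_ptr(rowMajor), 12 * sizeof(float));
			instance.instanceId = static_cast<unsigned int>(i);
			instance.sbtOffset = sbtOffset;
			instance.visibilityMask = 255;	// visible to every ray mask
			instance.flags = flags;
			instance.traversableHandle = gases[gasIndices[i]].traversableHandle();

			instances.push_back(instance);
		}

		// Build the actual IAS
		build(context, instances);
	}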

Chuppa


Maybe something to add to this topic since it is related: is there a performance difference between doing 2 optixTrace calls on the separate GAS traversable handles instead of 1 optixTrace call on the IAS traversable handle? If not, I think in my case it might be better to go for the separate calls approach, since I only need the closest hit of the scene geometry and then reject all further intersections with the grid geometry. I could then just make a first call on the scene GAS, save the closest intersection, then make the second call on the grid GAS and check each intersection against the saved closest scene intersection. Another valid solution would indeed be to add dummy geometry to the scene to force its bounds to be hit first, as @dhart suggested.

Perf-wise, it’s definitely far better to trace only 1 ray against the IAS than trace 2 rays. That doesn’t necessarily mean you should avoid the 2 ray approach, but I’ll explain why. Calling optixTrace is a function call that has to trigger the RT core to begin traversal, which passes parameters, uses registers, etc., and then transfers control to the RT core, and later transfers control back to the CUDA core, so there are overheads with the call, and also with the round-trip between cores. When tracing against the IAS, the RT core hardware handles the transition between the IAS and GASes without returning to the CUDA core, so the single IAS trace saves not just a function call, but also a complete round-trip to the RT core, and those things are more work than what the RT core has to do to transition from the IAS to the GAS traversal and back.


David.

