Traversing order

Hi,

I’m looking to report each unique SBT hit along a ray.
My plan is to have a buffer per GAS, to use an anyhit program with optixGetInstanceId() to decide which buffer to write to and then optixGetSbtGASIndex() to decide the index in the buffer.

Since I’m only interested in one hit per SBT I need an efficient way of skipping the subsequent primitives. Is there an easy way to do this? My first thought was to store the indexes on the ray and then compare to check if the next hit is on the same SBT and instance. But that would require the tracing to be linear to such an extent that all hits within an SBT is reported in sequence. Is this true?

Would you maybe know of any other efficient concept for this?

Thanks,
Oscar

I’m looking to report each unique SBT hit along a ray.
Since I’m only interested in one hit per SBT…

Could you reformulate that question by describing exactly the desired data layout of the result vector?

I have too many open questions why you would need this and what the resulting output from a launch should be.

Do you need this per ray, or do you only need to know which SBT hit records are touched by all rays in a launch?
You do not care about stopping rays anywhere? (Like X-Rays)

This is purely for geometric analysis and nothing graphical.

The final result would be how much a specific geometry obstructs one or more targets in relation to one or more viewpoints.

So each ray is first traced on a traversable with my targets, the ones that miss don’t report anything, the ones that hit triggers a second pass. The second pass has the distance to the target as tmax, the target index in the prd and is traced on a IAS traverable with one or more “obstruction” GAS. I then have as many HitBuffers as “obstruction” GAS instances in the obstruction IAS, which are target count * obstruction count in size. For each unique SBT index/unique mesh along the ray I then want to atomicAdd to

HitBuffer[obstruction GAS index][target count * SBT Index + target index]

There might in the future be of interest to also incorporate the viewpoint index but leaving that for now.
As a result I will then be able to query how much geometry number x in gas number y obstructs target number z.

Hope that explains my intent.

Hi Oscar,

Any-hit shaders are not guaranteed to be called in depth order, and in practice you will see them called out of order. To guarantee you have a hit program called in depth order, you would need to use closest-hit, and potentially re-launch your rays.

I think your any-hit plan sounds reasonable though, with slight modifications, so worth trying it out. My gut reaction is that you might see some contention using atomicAdd() that could slow things down, so it could be worth thinking about alternative approaches, depending on how efficient this part of your pipeline needs to be. With the any-hit approach, you would probably need to keep track of and sort the hits you want to remember via your payload, and wait until the ray traversal is complete before writing into your HitBuffer structure in raygen.

Another way to structure the problem might be to cast a kernel of primary closest-hit rays, with any-hit shaders disabled for extra performance. And then collect a buffer of hits to relaunch in a 2nd kernel, and this time use an any-hit program that ignores intersections that match the SBT index of the first hit. Your closest-hit program would finish the job by recording the SBT & GAS index of the 2nd hit, which is guaranteed to be different from the first hit. You could do this without atomics, and after that, perform a reduction on the two buffers. This may or may not be faster than using atomics, it would need to be tested.

Or you could mix the ideas and use a single kernel with any-hit, store the first and second hits into a buffer that matches your image pixel dimensions, and reduce on that single buffer after tracing is complete. You would need to perform your own depth sort on the two hits in your any-hit program as you go by comparing each incoming hit to the two stored hits and shuffling the stored hits or discarding the incoming hit as necessary - takes a little work but I would guess is doable.


David.

Hi David,

Thanks for you suggestions! Will look into the double kernels, I guess that also gives me more coherent rays per launch compared to the single kernel doing both jobs.

The dept order wouldn’t really help because I can’t ensure that multiple meshes won’t intersect. I just want to know if the mesh has at least one face that intersects with the ray.

Tested it now with the “store only previous index approach” and it’s double counting just as you say, when multiple meshes are close to each other. So I guess I have to collect all the unique hits along the ray some way in the prd to make sure I only count each SBT once.

Here’s just a test with 4 viewpoints looking down with the ground plane as a target. View cover from the ball meshes is graded from high-red to low-green.

Thanks
Oscar

Cool picture!

The dept order wouldn’t really help because I can’t ensure that multiple meshes won’t intersect. I just want to know if the mesh has at least one face that intersects with the ray.

Ah, so I think I misunderstood your question at the top of the thread about the linear ordering, I thought you meant depth order. I’m guessing the answer is still the same either way, that no linear ordering of anything is guaranteed when it comes to any-hit programs.

So I guess I have to collect all the unique hits along the ray some way in the prd to make sure I only count each SBT once.

This is another one I might not have grokked correctly. I had it in my head that you only wanted the first two unique SBT entries, where an SBT entry is a proxy for a unique combination of mesh & material. If you need all of them along a ray, and there can be an arbitrary number, it gets a little harder, of course.

There’s a sample in the old Advanced Samples repo that does something a little bit like what you’re suggesting. Not the same, only vaguely similar. But might be worth at least glancing at. This sample is rendering transparent particles, and in order to get the compositing approximately correct, it collects a buffer of hits along the ray, and then sorts the buffer by depth order in raygen, after tracing the ray. It blindly appends to the buffer, and will silently discard hits once the buffer is full. One could be slightly smarter about it and evict any hits with a t value greater than the incoming hit. You could also imagine using a small hash table of SBT indices rather than unique hits.

https://github.com/nvpro-samples/optix_advanced_samples/blob/master/src/optixParticleVolumes/raygen.cu

https://github.com/nvpro-samples/optix_advanced_samples/blob/master/src/optixParticleVolumes/material.cu


David.

Thought I’d post the Anyhit program I’ve ended up with for my second trace:

extern "C" __global__ void __anyhit__occlusion()
{
	// Get ID for this geometry (int2 with sbt index and instance index)
	int2 geoID = make_int2(optixGetSbtGASIndex(), optixGetInstanceId());

	// Get ray data ptr
	GeometryObstructionPRD* prd = reinterpret_cast<GeometryObstructionPRD*>(getPRD());

	// Get total count of geoIDs hit so far
	int geoIDCounter = prd->geoIDCounter;

	// Check if the current geometry has been seen before
	bool addgeoID = true;
	if (geoIDCounter < GEO_ID_ARRAY_SIZE) // have we filled the array yet?
	{
		for (int i = geoIDCounter; i >= 0; i--) // move backwards from the most recent geoID
		{
			if (prd->geoIDs[i] == geoID) // has the current geometry been seen before?
			{
				addgeoID = false;
				break;
			}
		}
	}
	else // we have filled the array and have to account for looping past the end
	{
		int startIndex = geoIDCounter % GEO_ID_ARRAY_SIZE; // find the index of the most recent geoID
		for (int i = GEO_ID_ARRAY_SIZE - 1; i >= 0; i--) {

			int index = i + startIndex < GEO_ID_ARRAY_SIZE ? i + startIndex : i + startIndex - GEO_ID_ARRAY_SIZE; // get the actuall index in the array

			if (prd->geoIDs[index] == geoID)// has the current geometry been seen before?
			{
				addgeoID = false;
				break;
			}
		}
	}

	// Add geoID and hit if it hasn't been seen before
	if (addgeoID)
	{
		// add new geoID to the array
		prd->geoIDs[(geoIDCounter + 1) % GEO_ID_ARRAY_SIZE] = geoID;
		prd->geoIDCounter++;

		// add hit to this geoID hit buffer index for this target material
		uint32_t* GASbuffer = reinterpret_cast<uint32_t*>(params.hitBuffers[geoID.y]);
		atomicAdd(&GASbuffer[geoID.x * params.materialCount + prd->targetIndex], 1);
	}

	// continue trace
	optixIgnoreIntersection();
}

The prd holds an array of previous hits which is checked against to make sure the current geometry hasn’t been seen before. Looping through the array in order of last seen assuming it’s most likely that we’re hitting a recent geometry again.

But I still haven’t implemented a double kernel launch so there is a big difference in computation between the threads that hit or miss in the first pass, which is bad for performance if I understand correctly. @dhart would you have any tips or examples on how to store the data from the 1st kernel and reduce the buffer so a 2nd kernel only traces the rays that hit something in the 1st kernel.

Thanks
Oscar