I’m sure this is something simple, but it has me puzzled.
I am shooting rays through a cube of cubes (say a 60m3 cube made out of 1m3 cubes). I need to record every front face that a ray passes through.
The way I do this is by having an output buffer sized to the number of cubes, with each cube’s instanceId as its index in that buffer. Whether I use atomicAdd or simply set some other debug value, I am getting some weird effects.
It seems like hits only properly register closest to the origin of the ray and at some of the edges of the whole massing.
So some cubes “in between” don’t register any hits at all.
Any ideas what I might be missing?
Any hit isn’t enabled, enforcing it in the trace call doesn’t make a difference.
I’m not sure what that might be. Let me ask a few questions.
It sounds like you’re using instancing for the small cubes?
The cubes are using OptiX triangles, or a custom intersection program?
How are you propagating a ray after a hit? When any-hit is disabled, you are relaunching the ray with another optixLaunch() call?
What is the general ratio of missing hits to rays cast that you’re seeing, is it a small number of missing hits, or are most rays seeing this?
How are the small cubes arranged - are they randomly placed, or arranged into a regular grid?
Are the small cubes all exactly the same size? Do you have a scale value set in your instance transform that is different from uniform identity (1,1,1)?
How repeatable are the missing cubes - is it the same set regardless of the rays; will very similar rays (neighboring pixels and/or neighboring subpixel rays) experience the same missing cubes, or see a completely different set?
Here are a few stupid ideas of mine and things to double-check, maybe shooting these down will help us narrow in on the real cause.
If there are any indexing errors in allocating, setting, or passing the instance transforms, that could leave some of them uninitialized, which would result in them not getting traced. And of course, double-check and verify that the number of instances and the buffer sizes passed to the BVH builder is correct, etc.
If the number of spurious misses is low, maybe the problem is numeric precision - rays striking the edge between two cubes and missing the front face. OptiX tries to guarantee that triangles that have shared edges in a mesh are watertight, but there is no way to guarantee this for separate instances, especially if the scale values in the transform are non-identity; floating point precision will cause errors that allow rays to escape.
Some things you might try:
Copy your instance transform buffer to the host and inspect it for NaNs, Infs, or other unexpected, uninitialized, or bad data.
Turn down the number of sub-cubes until the problem goes away, or becomes easy to debug.
If the set of missing cubes is repeatable, try peeling away layers of the big cube mass until you can see the missing cubes.
Try sending multiple rays per pixel with varying amounts of jitter to check whether they experience the exact same hits & misses.
Check whether using any-hit vs relaunching gives you the same set of unexpected misses.
Save out the data for one ray and its hits, and visualize it in a separate program to see if any patterns become obvious. (Personally, I like adding code in my OptiX launch params & programs that will limit printf output to the clicked pixel, and then print OBJ format data and open it in Blender… super easy!)
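For anyone wanting to try the OBJ idea, here is a rough host-side sketch of what the export could look like. It is not taken from any OptiX sample; Vec3 and rayToObj are made-up names, and it assumes you have already copied the hit distances for one ray back to the host. It writes the ray as a single polyline whose vertices are the origin, each hit (sorted by distance), and the end point, which most OBJ importers should show as a chain of edges.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type, not part of OptiX.
struct Vec3 { float x, y, z; };

// Build OBJ text for one ray: origin, every hit point (sorted by t),
// and the ray end point, joined by a single "l" polyline element.
std::string rayToObj(const Vec3& origin, const Vec3& dir, float tmax,
                     std::vector<float> hitTs)
{
    std::sort(hitTs.begin(), hitTs.end());

    std::ostringstream out;
    auto emitVertex = [&](float t) {
        out << "v " << origin.x + t * dir.x << ' '
                    << origin.y + t * dir.y << ' '
                    << origin.z + t * dir.z << '\n';
    };

    emitVertex(0.0f);                    // vertex 1: ray origin
    for (float t : hitTs) emitVertex(t); // vertices 2..N+1: hit points
    emitVertex(tmax);                    // last vertex: ray end

    // OBJ indices are 1-based; connect all vertices in order.
    out << 'l';
    for (std::size_t i = 1; i <= hitTs.size() + 2; ++i) out << ' ' << i;
    out << '\n';
    return out.str();
}
```

Writing the returned string to a .obj file and importing it next to the cube mesh should make it fairly obvious which cubes along the ray never produced a vertex.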
I’m taking stabs in the dark, just hoping to make a suggestion that might help debugging, but apologies in advance if these are unrelated or lead to wasting time. The other option, of course, is to create a minimal reproducer code sample you can share.
Sorry, yes, there were a lot of assumptions on my part. Let me clarify.
The cubes are regular meshes, built in Unity, and I send them as-is to OptiX (it’s the first naive implementation, so no optimization). They are built out of triangles. I am using any-hit, so I am not manually propagating the ray after a closest-hit (CH).
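As a sketch of the transform check suggested above: once the instance transforms are copied back to the host, something like the following could flag suspicious entries. It assumes the transforms are packed as 12 floats per instance (a 3x4 row-major matrix, as in OptixInstance::transform); findBadTransforms is a made-up name, and treating an all-zero matrix as "uninitialized" is only a heuristic.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the indices of instances whose 3x4 row-major transform
// contains a NaN/Inf, or is entirely zero (likely never written).
// Assumes 12 consecutive floats per instance; trailing floats that do
// not form a full transform are not inspected.
std::vector<std::size_t> findBadTransforms(const std::vector<float>& transforms)
{
    std::vector<std::size_t> bad;
    const std::size_t count = transforms.size() / 12;
    for (std::size_t i = 0; i < count; ++i) {
        bool finite = true;
        bool allZero = true;
        for (std::size_t k = 0; k < 12; ++k) {
            const float v = transforms[i * 12 + k];
            if (!std::isfinite(v)) finite = false;
            if (v != 0.0f) allZero = false;
        }
        if (!finite || allZero) bad.push_back(i);
    }
    return bad;
}
```

An empty result does not prove the transforms are right, only that none of them are obviously broken; comparing the returned count against the expected number of instances is a cheap extra sanity check.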
As for the arrangement, it’s from a simple loop over x, y, z. I also verified with my debug renderer and they are in the scene, visible to Optix where they are supposed to be.
What it looks like in 1D is this (assuming one horizontal ray):
→
It looks like only the first and the last cube trigger an any-hit call. I recall that there are some specifics around any-hit, so I was wondering if I was missing something. (I already did my browse through the documentation etc.) I also don’t think it’s an edge case where the ray leaks through meshes (unless OptiX has an issue with vertices from different meshes being in the same position).
Cool idea with obj export!
But yes, I think it’s somewhere on my end then and will try out reducing complexity etc.
Check your ray flags when you cast the ray, is it perhaps using OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT accidentally?
Also check your anyhit program, make sure it’s not calling optixTerminateRay().
And I assume anyhit is enabled, of course, but see if there are any flags being used with _ANYHIT_ at all.
I hope it’s one of these easy things - and I know from experience how easy it is to lose track of some setting somewhere, since there are a lot of them in a lot of different places. But if none of those are the issue, then I’m definitely interested in figuring out how we can reproduce it on our end.
thanks! I have all of those covered. I have only OPTIX_RAY_FLAG_ENFORCE_ANYHIT (now, tried without before). the AH function really is just array[index] = index; so no terminateRay, checked my sources and no results for anyhit where there shouldn’t be.
I tried increasing the output buffer, just in case I write somewhere that I shouldn’t, same results.
“And I assume anyhit is enabled”
What exactly do you mean? I searched all over to see if I missed something, but I don’t think I specifically enabled anyhit anywhere. (outside of setting the moduleAH & entryFunctionNameAH in the prog group)
But really, just having someone check behind all of those helped. I’ll keep hacking away at it, but will gladly come back to your offer if that doesn’t work out in the next few days.
Oh, anyhit is enabled by default, I just meant make sure you haven’t disabled it anywhere, since there are multiple ways anyhit can be disabled.
We aren’t aware of any cases where a first anyhit is called but subsequent anyhits along the ray aren’t called. Another couple of super easy things to check: make sure the t-max value is big enough to catch everything, and check whether anyhit is being called using a method that is independent of your data capture, for example by putting a printf in the anyhit program, limiting the printf to a specific pixel, and seeing if the number of lines printed agrees with your dynamic queue size.
You can also inspect & instrument one of the SDK samples that uses anyhit to see if it appears to behave differently, and maybe why. Both optixCutouts and optixWhitted make use of anyhit programs, and depend on multiple anyhit invocations along the ray in order to render properly.
Thanks again! (And yes, double checked everything you suggested)
I am still having trouble getting stdout etc. to work properly in this stack running from Unity, but I did get some data out and it all checks out/is the same as I get reported in the end. I even tried printf’ing every single ray.
I have it working somewhat better now.
My assumption was that writing to the same index in the output buffer wouldn’t hurt. So the initial buffer is set to 0, and at each intersection I just write the instanceId into the buffer at the index taken from the instanceId.
Just for giggles I tried checking the buffer before writing.
So I went from
So wait, to make sure I understand - in the first case (your previous version where it wasn’t working), were you calling optixIgnoreIntersection()? I assumed that optixIgnoreIntersection() was always being called at the end of any-hit, but my mistake for not asking about that way before now. Not calling it (either intentionally or accidentally) could result in the kind of behavior you noticed. Always ignoring all intersections is the way to let a ray run through everything and process all potential intersections with the any-hit program, otherwise the ray will stop after finding a hit somewhere. (Apologies if I’m stating the obvious and what you already tried.)
If you were already ignoring all intersections before, and the new version is now only ignoring some of them and seems to work better, then I’m still stumped, and guessing that maybe any-hit is getting called but the data collection is failing to register it for some reason.
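To make the accept-versus-ignore difference concrete, here is a small CPU-side toy model. It is not OptiX code; Candidate and traverse are made-up names. It only mimics two rules: any-hit runs for a candidate only if its t lies inside the ray’s current interval, and a hit that is not ignored is accepted and shrinks t-max to that hit. The visiting order stands in for BVH traversal order, which is not front-to-back in general, so the order used in the note after the code is just an assumption.

```cpp
#include <vector>

// Hypothetical stand-in for a potential intersection found during traversal.
struct Candidate { int instanceId; float t; };

// Toy model of traversal. Returns the instance ids for which the
// "any-hit program" was invoked, in the order it was invoked.
std::vector<int> traverse(const std::vector<Candidate>& visitOrder,
                          float tmax, bool ignoreInAnyHit)
{
    std::vector<int> recorded;
    for (const Candidate& c : visitOrder) {
        // Candidates outside the current interval never reach any-hit.
        if (c.t <= 0.0f || c.t >= tmax) continue;

        // "Any-hit body": record the instance that was hit.
        recorded.push_back(c.instanceId);

        // Not ignoring means the hit is accepted and the interval
        // shrinks, so everything farther away is culled from now on.
        if (!ignoreInAnyHit) tmax = c.t;
    }
    return recorded;
}
```

With six cubes in a row and a visiting order of cube 5 (t = 5.5), cube 0 (t = 0.5), cube 3 (t = 3.5), cube 1 (t = 1.5), calling traverse with ignoreInAnyHit = false records only cubes 5 and 0, while ignoreInAnyHit = true records all four. That is the same "first and last only" pattern described earlier in the thread, although the real traversal order may differ.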
as long as I call optixIgnoreIntersection() after the atomicAdd.
Another possibility I didn’t entertain yet is that maybe you’re getting an exception in the any-hit on the atomic or something before that in your code. If the shader exited early due to an error before the intersection gets ignored, that might have the same effect as not calling ignore. Could be worth enabling exceptions and seeing if they’re happening. I guess if your stdout pipe isn’t working, it’s possible the default exception program will try to print things that you can’t see, so you might need to add your own custom exception program that routes status output to a buffer or something. Validation mode enables exceptions too, but the exception program will try to print to stdout.
So the line outputbuffer[instanceId] = instanceId - the idea there is to check that all instances were hit, because they’re all inside the view frustum, and it’s okay that many rays will try to write the same value?
I’ve read 11.13 a few times, and it just wasn’t clear to me that you have to call optixIgnoreIntersection() for the ray to continue.
Not with the way it was written there. I don’t want to ignore the intersection and in my case which intersection is closest doesn’t matter either. Reading it, it just made more sense to me that the default behaviour is that anyHit is called for intersection, unless terminate is called.
Well, that was a long week. Looking at the documentation, I am sure my mental model wasn’t correct. But maybe add a sentence reflecting this behaviour? But since this doesn’t seem to be a common problem I guess it’s on me.
Just for the record… Yes, I had the exception idea as well, and yes, for debugging I didn’t mind that the same value was written repeatedly. I even increased the outputbuffer 10000 times, just to see if I am writing into places I shouldn’t. Thankfully the way optix throws exceptions I can usually catch them. Just getting printf to work is cumbersome and not good for day to day dev.
Again, thanks a lot for taking the time to think this through with me. It really helped me eliminate some possible causes.