I’ve run into an issue where a normal vector that is assigned a nonzero value in a custom geometry intersection program has all zero values in the material program. The attribute is declared as follows:
rtDeclareVariable( OverloadedAttributes, oaOverload, attribute oaOverload, );
The use cases triggering the problem are those that use the nonTriangleNormal member. Removing the “union” and allowing the attribute to be an oversized struct has no effect. Adding “float pad[ 2 ];” before the union (or before the vectors when the union keyword is absent) avoids the issue, and the code works as expected. Larger float array pads also avoid the issue; a single float pad does not.
We’ll test with newer drivers once I can get an admin to update them. Any chance this is a known (fixed?) issue?
Current test configuration:
CUDA 10.0 and OptiX 6.5
Problem has not occurred on Linux with 470 and earlier drivers (recently tested with 470.94) and assorted GPU hardware.
Latest Linux drivers not yet tested.
Problem has not occurred on Windows with older drivers (recently tested with 472.84) or with the 516.25 driver, and assorted GPU hardware.
A couple of questions -
I assume this is a dumb question and the answer is yes, but are you using the OverloadedAttributes declaration on both ends of the attribute, both in the custom intersection program and in the material program? (Curious if the intersect program might only know about the normal attribute, for example.)
Does the struct change size when you use a union, and is the size reported the same in both intersect and shade?
I tried to superficially reproduce this problem in the optixWhitted SDK sample using your struct instead of the shading_normal attribute. I was unable to reproduce any problems. I changed sphere_shell.cu (intersect) and glass.cu (shade).
Can you share a minimal reproducer in source or binary form, and/or create a reproducer out of one of our SDK samples?
Yes. There’s only one declaration of the struct and attribute, and it’s in a header included everywhere.
sizeof results are the same for the intersection and material programs, and show no signs of padding.
Using union: 12
Using struct without union: 24
Using union with 2 float pad: 20
Using optix::float4 for normal (error still occurs): 16
The results above are from the 515.48.07 driver, which appears to behave the same as the 510.
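For anyone following along, the sizes reported above can be cross-checked on the host with a small sketch like the one below. The actual struct definition is not shown in this thread, so every layout here is a guess: only the member name nonTriangleNormal comes from the posts, while triangleData, Float3, and Float4 are hypothetical stand-ins (Float3 and Float4 mimic optix::float3 and optix::float4, assuming 16-byte alignment for the latter). It only shows that layouts of this shape give 12, 24, 20, and 16 bytes with no hidden padding; it does not reproduce the bug.

```cpp
#include <cstdio>

// Host-side stand-ins for optix::float3 / optix::float4 (assumption: the real
// float4 is 16-byte aligned, as CUDA's built-in float4 is).
struct Float3 { float x, y, z; };
struct alignas(16) Float4 { float x, y, z, w; };

// Hypothetical layouts. Only "nonTriangleNormal" is named in the thread;
// "triangleData" is an invented placeholder for the other union member.
struct WithUnion {
    union { Float3 triangleData; Float3 nonTriangleNormal; };
};

struct WithoutUnion {            // "union" keyword removed: both vectors stored
    Float3 triangleData;
    Float3 nonTriangleNormal;
};

struct PaddedUnion {             // "float pad[ 2 ];" placed before the union
    float pad[2];
    union { Float3 triangleData; Float3 nonTriangleNormal; };
};

struct Float4Union {             // normal widened to a 4-component vector
    union { Float3 triangleData; Float4 nonTriangleNormal; };
};

// Compile-time checks against the sizes reported in the thread.
static_assert(sizeof(WithUnion)    == 12, "union of two float3");
static_assert(sizeof(WithoutUnion) == 24, "two float3 side by side");
static_assert(sizeof(PaddedUnion)  == 20, "8 bytes of pad + 12-byte union");
static_assert(sizeof(Float4Union)  == 16, "union dominated by float4");

// Prints the same numbers at run time, e.g. to compare against device output.
void printSizes() {
    std::printf("union: %zu\n", sizeof(WithUnion));
    std::printf("struct without union: %zu\n", sizeof(WithoutUnion));
    std::printf("union with 2 float pad: %zu\n", sizeof(PaddedUnion));
    std::printf("float4 normal: %zu\n", sizeof(Float4Union));
}
```

If the real struct has other members, the static_assert lines would need adjusting; the point is only that a compile-time size check in the shared header would catch any disagreement between the intersection and material programs.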
I don’t currently have a minimal reproducer. The issue occurs with one ray generation program but not another, so I’m guessing it requires some particular combination of memory allocations/alignment to trigger.
It seems like printing the address of the attribute in the intersect/material programs for the pass/fail cases might be interesting, but taking the address of an attribute is not allowed. Is there any good-enough-for-a-debug-print workaround for that?
Okay, thanks for confirming. Does the behavior change between debug & release configurations?
Printing attribute addresses is tricky as a debug technique, since the whole reason for the attribute mechanism is to keep attributes in registers whenever possible; just attempting to inspect an attribute’s address might force the attributes into memory and compile them differently than you would normally get. What I might recommend instead is taking a few minutes to study the PTX and also the SASS code in the region of the attribute access to see if you can tell where it’s going wrong. PTX should be easy to get since your build or app generates it. For SASS you might use Nsight Compute. Not saying this will be easy, but if you can tell whether the problem is present in the PTX or doesn’t show up until the SASS code, that gives us information about where the bug is. If that turns out to be too difficult, it might be better to spend time figuring out how to reproduce in a sample, or to minimize and package up a reproducer.
The union might just be sometimes confusing the compiler, assuming there’s a compilation bug. I assume that if you use a single float3 and no union and no struct, then everything works fine? It does sound like there’s some complexity to the initial conditions that trigger this problem. Is it realistic for you to create a reproducer we can use at some point in the future?
Some further debugging on a different issue (Illegal address error when using both GeometryTriangles and Geometry nodes) makes me think the two are related. I’m thinking the moral of the story is to stop putting structs in attributes.
That’s maybe not a bad idea just so you can stop fighting it and move forward. I have to ask the compiler team, but it might not be surprising if structs in attributes aren’t well tested. Doing manual masking may be less likely to bump into weird compiler corners. That said, we would like to make sure it’s either compiling correctly or reporting an actionable error message. We can take your notes and try to generate some tests, but if you do end up with more context or a reproducer you can share, it would certainly help us patch the holes. In any case, thank you for reporting this.
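One possible reading of “stop putting structs in attributes” is to replace the struct/union attribute with separate plain attributes: one vector and one integer tag that says how to interpret it. In OptiX 6.5 each of these would be its own declaration in the style of rtDeclareVariable( float3, shading_normal, attribute shading_normal, ); as in the SDK samples. The sketch below only models that idea on the host so it can be compiled anywhere; the names attrVector, attrKind, kTriangleData, kNonTriangleNormal, reportCustomHit, and shadingNormal are all hypothetical and not taken from the original code.

```cpp
#include <cassert>
#include <cstdio>

// Host-side stand-in for optix::float3 (assumption, not the OptiX type).
struct Float3 { float x, y, z; };

// Hypothetical tag values saying how the shared vector should be read.
constexpr int kTriangleData      = 0;
constexpr int kNonTriangleNormal = 1;

// Stand-ins for two separately declared plain attributes. In device code
// each would be its own attribute variable rather than a struct member.
static Float3 attrVector = {0.0f, 0.0f, 0.0f};
static int    attrKind   = kTriangleData;

// Models the custom intersection program reporting a hit with its normal.
void reportCustomHit(Float3 normal) {
    attrVector = normal;
    attrKind   = kNonTriangleNormal;
}

// Models the material program: use the stored normal for custom geometry,
// otherwise fall back to a normal computed some other way.
Float3 shadingNormal(Float3 fallback) {
    assert(attrKind == kTriangleData || attrKind == kNonTriangleNormal);
    return attrKind == kNonTriangleNormal ? attrVector : fallback;
}
```

A caller would invoke reportCustomHit with the computed normal and later read it back through shadingNormal. Whether this sidesteps the driver behavior described above is untested here; it simply removes the union and struct from the attribute path.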