Sorry for the slow response.
I responded briefly over email to @belakampis1, but I figured I’d also give a public response for anyone who might come across this thread.
I wouldn’t be surprised to discover that this is another DXC bug. I’ve had luck in the past adding a reproducer to Sascha Willems’ Vulkan examples. I think part of the issue is that many folks just don’t understand parts of the ray tracing pipeline like the shader binding table, any-hit shaders, custom intersection shaders, callable shaders, etc. So an example in Sascha’s repository can be educational for the community and can also demonstrate that the equivalent GLSL code works while the HLSL/DXC version doesn’t, which can help the DXC devs reproduce the issue and put pressure on them to fix it.
My workaround at the moment is rather invasive: I wrap all HLSL entry points in my own C-style macro, which lets me inject code at the beginning and end of each entry point body. The bug occurs when the intrinsics are called inside of functions, so you need to call them only from the main entry point body.
So, I currently create two static booleans right above the macro, and write my own namespaced versions of the intrinsics:
static bool _ignoreHit = false;
static bool _acceptHitAndEndSearch = false;
namespace gprt {
void ignoreHit() { _ignoreHit = true; }
void acceptHitAndEndSearch() { _acceptHitAndEndSearch = true; }
}
#define GPRT_ANY_HIT_PROGRAM(progName, RecordDecl, PayloadDecl, AttributeDecl) \
/* fwd decl for the kernel func to call */ \
inline void progName(in RAW(TYPE_NAME_EXPAND) RecordDecl, inout RAW(TYPE_NAME_EXPAND) PayloadDecl, \
in RAW(TYPE_NAME_EXPAND) AttributeDecl); \
\
[[vk::shader_record_ext]] ConstantBuffer<RAW(TYPE_EXPAND RecordDecl)> CAT(RAW(progName), \
RAW(TYPE_EXPAND RecordDecl)); \
\
[shader("anyhit")] void __anyhit__##progName(inout RAW(TYPE_NAME_EXPAND) PayloadDecl, \
in RAW(TYPE_NAME_EXPAND) AttributeDecl) { \
progName(CAT(RAW(progName), RAW(TYPE_EXPAND RecordDecl)), RAW(NAME_EXPAND PayloadDecl), \
RAW(NAME_EXPAND AttributeDecl)); \
if (_ignoreHit) \
IgnoreHit(); \
if (_acceptHitAndEndSearch) \
AcceptHitAndEndSearch(); \
} \
\
/* now the actual device code that the user is writing: */ \
inline void progName(in RAW(TYPE_NAME_EXPAND) RecordDecl, inout RAW(TYPE_NAME_EXPAND) PayloadDecl, \
in RAW(TYPE_NAME_EXPAND) AttributeDecl) /* program args and body supplied by user ... */
#endif
Both flags are false when the entry point is called, and the wrapper then runs the user’s actual entry point code. If either of these virtual intrinsics has set its flag to true by the end of the entry point, I call the corresponding real intrinsic at the end of the function.
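For anyone who wants to see the pattern without the macro machinery, here is a minimal sketch of roughly what one any-hit program looks like once written out by hand. It is an illustration under assumptions, not the exact expansion of the macro: the `Payload` struct and the `alphaTest` function are hypothetical, the shader record is left out, and the built-in triangle attributes are used instead of a custom attribute type.

```hlsl
// Hypothetical payload type, for illustration only.
struct Payload {
  float3 color;
  float opacity;
};

// Deferred-request flags. HLSL static globals are private to each shader
// invocation, so every any-hit invocation starts with both set to false.
static bool _ignoreHit = false;
static bool _acceptHitAndEndSearch = false;

namespace gprt {
// Stand-ins for the intrinsics. They only record the request, so they can
// be called from ordinary functions.
void ignoreHit() { _ignoreHit = true; }
void acceptHitAndEndSearch() { _acceptHitAndEndSearch = true; }
}

// User code (hypothetical). This is a plain function, so it must not call
// the real intrinsics directly.
void alphaTest(inout Payload payload, in BuiltInTriangleIntersectionAttributes attribs) {
  if (attribs.barycentrics.x < 0.25) {
    gprt::ignoreHit();
    return; // unlike IgnoreHit(), the stand-in does not end the shader here
  }
  payload.opacity = 1.0;
  gprt::acceptHitAndEndSearch();
}

[shader("anyhit")]
void __anyhit__alphaTest(inout Payload payload, in BuiltInTriangleIntersectionAttributes attribs) {
  alphaTest(payload, attribs);
  // The real intrinsics are only ever called from the entry point body.
  if (_ignoreHit)
    IgnoreHit();
  if (_acceptHitAndEndSearch)
    AcceptHitAndEndSearch();
}
```

One behavioral difference to keep in mind: the real `IgnoreHit()` and `AcceptHitAndEndSearch()` stop the shader at the call site, while the stand-ins only set a flag, so user code has to `return` explicitly if nothing after the call should run. If both flags end up set, `IgnoreHit()` wins because it is checked first and never returns.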
Ideally I wouldn’t have to do this, since I wouldn’t be surprised if it completely destroys any ability to profile my kernels with debug information… With all the workarounds I have for HLSL, I’ve started to consider migrating to Slang instead…