When using a bindless buffer, it may happen that you access it out of bounds or before setting its address handle. In both cases, bad things happen to your application, including context crashes and driver crashes. By a context crash I mean that after the OpenGL program has terminated, future allocations and make-resident calls yield a GL_OUT_OF_MEMORY or similar error. That is completely fine behavior in such a case, as you clearly set up or accessed the buffer improperly.
However, this can also happen without you even realizing that such accesses occur. I found two very interesting cases that took me a while to track down. Here is a complete compute shader demonstrating the issues:
#version 450
#extension GL_ARB_gpu_shader5 : enable
#extension GL_NV_shader_buffer_load : enable

layout(local_size_x = 32, local_size_y = 1, local_size_z = 1) in;

struct InputStruct
{
    restrict readonly int* a; // Always set
    restrict readonly int* b; // Not set
    bool useB;                // false
};

uniform restrict writeonly int* result;
uniform InputStruct input;

int getA(InputStruct s, uint idx)
{
    return s.a[idx];
}

int getB(InputStruct s, uint idx)
{
    return s.b[idx];
}

void main()
{
    uint idx = gl_GlobalInvocationID.x;

    // Variant 1: Access unset buffer in ternary operator with always-false uniform condition
    result[idx] = (input.useB) ? input.b[idx] : input.a[idx]; // Crashes sporadically

    // Variant 2: Access unset buffer in second condition when first is already true
    if ((!input.useB) || (input.b[idx] == 0)) // Crashes sporadically
    {
        result[idx] = input.a[idx];
    }

    // Variant 3: Access buffers through helper functions
    result[idx] = (input.useB) ? getB(input, idx) : getA(input, idx); // OK
}
Variant 1: I have two different buffers to choose from, a and b, where b is not set, meaning I did not set the uniform uint64_t address handle of that buffer. I also have a uniform bool useB, which is set to false by the host application. If I now use the ternary operator to choose between the two buffers, i.e.
result[idx] = (input.useB) ? input.b[idx] : input.a[idx];
the context crashes sporadically as described above, which I guess happens because input.b[idx] is evaluated even though the condition (input.useB) is uniform and false.
Besides the crash itself, it puzzles me why it happens only sporadically (read: about every second or third compute dispatch). Is this due to the random uninitialized value of the unset buffer handle input.b?
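For what it's worth, the obvious rewrite to try would be an explicit branch instead of the ternary. This is only an untested sketch; the compiler may well flatten it back into the same select that evaluates both sides:

// Untested sketch: replace the Variant 1 line in main() with explicit
// control flow. The compiler may still evaluate both sides.
if (input.useB)
    result[idx] = input.b[idx];
else
    result[idx] = input.a[idx];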
Variant 2: I have two boolean conditions that I OR together; the first one is uniform and true, and the second one would access the unset buffer for the comparison, i.e.
if ((!input.useB) || (input.b[idx] == 0))
This, again, sporadically crashes the OpenGL context, and I assume it is due to the evaluation of the second condition even though the first one is already true.
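The same kind of untested sketch applies here: the short-circuit could be made explicit by nesting the conditions, assuming the compiler does not hoist the load out of the branch:

// Untested sketch: replace the Variant 2 condition in main() with nested
// control flow, so the access to input.b sits behind a real branch
// instead of being a || operand. The compiler may still hoist the load.
if (!input.useB)
{
    result[idx] = input.a[idx];
}
else if (input.b[idx] == 0)
{
    result[idx] = input.a[idx];
}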
I am aware that a lot of optimizations become possible when both alternatives may be evaluated at once instead of sequentially, as in Variant 1, but then the driver has to take care that such speculative evaluation does not crash the entire application.
Variant 3: One very interesting finding for me is that the crash of Variant 1 can be avoided by replacing the direct buffer accesses with indirections through helper functions that perform the actual access. I don't know which part of this modification makes it work (the use of a function? the copying of the input struct as a function parameter?), but it does; a small experiment to separate the two is sketched below.
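Here is a hypothetical experiment to isolate the cause (untested, and getAGlobal/getBGlobal are just names made up for it): helpers that read the global uniform directly, so there is a function call but no struct copy. If this variant still crashes like Variant 1, the struct copy is the decisive ingredient; if it runs cleanly, the function call alone is enough:

// Hypothetical experiment: helper functions without the struct copy.
// They read the global uniform 'input' directly instead of receiving
// a copy as a parameter.
int getAGlobal(uint idx)
{
    return input.a[idx];
}

int getBGlobal(uint idx)
{
    return input.b[idx];
}

// In main():
// result[idx] = (input.useB) ? getBGlobal(idx) : getAGlobal(idx);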
So, although I am straying close to the limits of “undefined behavior” with my unset buffer here, I would still consider this a bug, as it is not obvious to the programmer that accesses to this buffer can actually happen, despite a uniform condition that would prevent them in host code with sequential evaluation of conditions.
Tested on a Windows 8.1 64-bit system with driver version 372.70.