Local variable reverting to value from previous call to function

I’ve run into an odd bug after a driver update. I have an inlined device function that does shadow checks, and it is called inside a for loop over the lights in the scene. With the 527.27 driver, the value of a variable in the function is correct for the first light, but when the function runs for the second light, the variable’s value reverts to the previous light’s value partway through the function. I cannot share the actual code, but I’m attaching a pared-down pseudocode version of the function that hopefully illustrates what I’m seeing.
I can work around the issue with some extra copying and volatile variables. Are there any known optimizer issues that might cause something like this?

Current test configuration, with variable reversion problem:
Windows 10 21H2
dual Quadro RTX 4000
527.27 driver
Visual Studio 2017
CUDA 10.0
OptiX 6.5

Configurations without the problem:
Same machine as above with 516.59 driver. Various other Windows/Linux machines with 510 series and previous drivers and Maxwell/Pascal/Turing/Ampere GPUs (some Quadro, some GeForce).


__device__ __forceinline__ void lighting(/*etc*/)
{
  for ( int currentLight = 0; currentLight < lightCount; ++currentLight ) {
    const float shadowScale = shadowCheck( hitPoint, lights[ currentLight ], normal );
    //etc    
  }
}

__device__ __forceinline__ float shadowCheck(
  const optix::float3& hitPoint, 
  const Light& light, 
  const optix::float3& normal, /*etc*/ )
{
  optix::float3 lightDirection;
  bool useFiniteRay;
  if ( light.type == X ) {
    lightDirection = optix::normalize(/*etc*/);
    useFiniteRay = true;
  } else if (light.type == Y ) {
    lightDirection = /*etc*/;
    useFiniteRay = false;
  } else {/*etc*/}
  
  // These prints are OK for both lights.
  rtPrintf( "lightDirection %f, %f, %f\n", lightDirection.x, lightDirection.y, lightDirection.z );
  rtPrintf( "useFiniteRay %i\n", useFiniteRay );
  
  // Declared before the if so that the return at the end of the function can see it.
  RayShadowPayload payload;
  //etc;
  
  if ( /*lightDirection vs normal and various flag checks*/) {
    do {
      // This print is wrong when the function is called for light 1. It has the 
      // lightDirection from the light 0 call.
      rtPrintf( "lightDirection %f, %f, %f\n", lightDirection.x, lightDirection.y, lightDirection.z );
      
      // This print is OK until I add a workaround for the lightDirection issue (copying and restoring 
      // lightDirection from a volatile struct). Then useFiniteRay starts showing the light 0 value when 
      // running for light 1. Making useFiniteRay volatile works around that issue.
      rtPrintf( "useFiniteRay %i\n", useFiniteRay );
      const float maxRayLength = useFiniteRay ? /*etc*/ : RT_DEFAULT_MAX;
      const optix::Ray shadowRay = optix::make_Ray(lightDirection, maxRayLength, /*etc*/);
      rtTrace(/*etc*/);
    } while (/*various flags*/);
  }
  return payload.value;
}
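
The workaround mentioned in the comments above (copying lightDirection into a volatile struct and restoring it inside the loop) might look roughly like the following. This is a minimal host-side C++ sketch, not the original code: `Float3`, `Float3Stash`, `traceStub`, and `shadowCheckSketch` are hypothetical stand-ins (for `optix::float3`, the volatile struct, `rtTrace`, and the real function), and it only shows the shape of the workaround, not the cause of the driver issue.

```cpp
// Hypothetical stand-in for optix::float3 so the sketch compiles without OptiX.
struct Float3 { float x, y, z; };

// Hypothetical holder with volatile members: every read and write must
// actually access memory, so the compiler cannot reuse a cached value.
struct Float3Stash {
    volatile float x, y, z;
};

// Hypothetical stand-in for the rtTrace call; it does nothing here.
inline void traceStub(const Float3& /*direction*/) {}

// Returns the sum of the direction components seen across all loop passes,
// purely so that the sketch has an observable result.
inline float shadowCheckSketch(const Float3& lightDirectionIn, int iterations) {
    Float3 lightDirection = lightDirectionIn;

    // Copy the value into the volatile stash before entering the loop.
    Float3Stash stash;
    stash.x = lightDirection.x;
    stash.y = lightDirection.y;
    stash.z = lightDirection.z;

    float sum = 0.0f;
    int pass = 0;
    do {
        // Restore from the stash at the top of every pass, so the value used
        // below always comes from a fresh memory read.
        lightDirection.x = stash.x;
        lightDirection.y = stash.y;
        lightDirection.z = stash.z;

        traceStub(lightDirection);
        sum += lightDirection.x + lightDirection.y + lightDirection.z;
        ++pass;
    } while (pass < iterations);
    return sum;
}
```

In the real function the restore would sit just before the lightDirection print and the `optix::make_Ray` call. The same idea applies to useFiniteRay, which only needs to be declared `volatile bool`.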

Hi @bdr, thanks for the report. This does look to me like a bug. It would be super helpful to get a reproducer somehow, since this issue could be intimately related to very specific things about your code, the payload or attributes, or your specific compilation config or execution environment. Do you think it would be practical for you to try to modify one of our SDK samples to get this to reproduce? I’d also be curious to hear whether the symptom still occurs when you comment out the rtTrace() call; that might help narrow down the possible causes.


David.

Removing the rtTrace call does avoid the issue. I’ve also found I can avoid the issue by removing the do/while loop, or by printing lightDirection again after the loop. Replacing the loop with an equivalent goto does not avoid the issue.
I’ll make an attempt to duplicate the behavior with an SDK example, but with this kind of issue I’m not optimistic.
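
For reference, "an equivalent goto" is read here as the usual mechanical rewrite of a do/while loop, sketched below in plain C++. The function names and the counter used as the loop body are hypothetical; the actual loop body and exit flags are not shown in the thread.

```cpp
// Structured form: the body runs once, then repeats while the condition holds.
inline int runWithDoWhile(int retries) {
    int passes = 0;
    do {
        ++passes;  // stands in for the real loop body (prints, rtTrace, ...)
    } while (passes < retries);
    return passes;
}

// The same control flow written with a label and a conditional backward jump.
inline int runWithGoto(int retries) {
    int passes = 0;
again:
    ++passes;      // same stand-in body as above
    if (passes < retries) goto again;
    return passes;
}
```

Compilers typically lower both forms to the same backward branch, which would be consistent with the goto rewrite not changing the behavior, although that is only a guess about this particular case.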

Understood, thanks in advance for trying. And thanks for checking on rtTrace; that will be helpful. We’d obviously like to fix it quickly. I’ve filed a bug report in the meantime.


David.

No luck reproducing this with an SDK example. I’m guessing the particulars of the calling code are part of what triggers it. I have a slightly simpler version of the pseudocode that still reproduces the problem in the original context.

__device__ __forceinline__ float shadowCheck(
  const optix::float3& hitPoint, 
  const Light& light, 
  const optix::float3& normal, /*etc*/ )
{
  optix::float3 lightDirection;
  lightDirection = optix::normalize(/*etc*/);  
  // These prints are OK for both lights.
  rtPrintf( "lightDirection %f, %f, %f\n", lightDirection.x, lightDirection.y, lightDirection.z );
  
  // Declared before the if so that the return at the end of the function can see it.
  RayShadowPayload payload;
  //etc;
  
  if ( /*lightDirection vs normal and various flag checks*/) {
    do {
      // This print is wrong when the function is called for light 1. It has the 
      // lightDirection from the light 0 call.
      rtPrintf( "lightDirection %f, %f, %f\n", lightDirection.x, lightDirection.y, lightDirection.z );
      const optix::Ray shadowRay = optix::make_Ray(lightDirection, RT_DEFAULT_MAX, /*etc*/);
      rtTrace(/*etc*/);
    } while (/*various flags*/);
  }
  return payload.value;
}

Bug and workaround behavior is the same in driver version 531.18. I cleared the OptiX disk cache before testing for good measure.

This has been noted in the open bug report. Thank you for testing & updating here!


David.

I stumbled into a new manifestation of this issue after switching the struct holding an array of Light object pointers from an rtDeclareVariable to a pointer. The struct/array access and dereferencing occur in the calling function, but making that change causes a problem with light.position, an optix::float3 member variable. Some debug printing shows that, when used inside the do/while loop, the address of light.position reverts to the value from the initial call (light 0). Changing the function signature to take light as const Light* const volatile light works around this iteration of the problem.
I’m not sure if this provides any useful new information, but it seems state handling around rtTrace calls is unreliable under some hard-to-duplicate conditions.
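
The signature change described above can be sketched in plain C++ as follows. `Float3`, `Light` (reduced to a single `position` member), and `positionSum` are hypothetical stand-ins so that the example compiles without OptiX; the real `Light` type and function body are not shown in the thread.

```cpp
// Hypothetical stand-ins for optix::float3 and the Light struct.
struct Float3 { float x, y, z; };
struct Light  { Float3 position; };

// `const Light*` : the Light being pointed to is read-only.
// `const volatile` after the `*` : the pointer parameter itself cannot be
// reassigned, and every use of it must re-read the pointer value instead of
// reusing a copy the compiler kept from earlier.
inline float positionSum(const Light* const volatile light) {
    float sum = 0.0f;
    int pass = 0;
    do {
        // Each `light->` below performs a fresh volatile read of the pointer.
        sum += light->position.x + light->position.y + light->position.z;
        ++pass;
    } while (pass < 2);
    return sum;
}
```

Top-level qualifiers on a parameter are not part of the function's type, so the only change at the call site is passing `&light` (or an existing pointer) instead of a reference. C++20 deprecates volatile-qualified parameters, which should not matter for the C++14-era toolchain listed here.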

Windows 10 Pro 22H2
dual Quadro RTX 4000
528.86 driver
Visual Studio 2017
CUDA 10.0
OptiX 6.5