Mesh artifacts when using any-hit for transparency in OptiX 7

Hi all,

I’m writing a renderer in OptiX 7, originally based on the SIGGRAPH course by Ingo Wald. I want to implement a transparency setting which makes a ray skip the first hit on a geometry if that geometry is set to be transparent.

I think a good option is to use an anyhit program and a payload on the ray which counts how many times the ray has hit a transparent geometry. I’ve implemented it like this (all geometry that is not transparent has the OPTIX_GEOMETRY_FLAG_DISABLE_ANYHIT flag):

   extern "C" __global__ void __anyhit__radiance()
  { 
	  unsigned int hitcount = getPayloadWindow();
	  if (hitcount < 1)
	  {
		  setPayloadWindow(hitcount + 1);
		  optixIgnoreIntersection();
	  }
	  else
		  optixTerminateRay();
  }

But this gives me artifacts on the transparent faces:

Here’s a render without transparency:

Here is my current workaround in the closest hit program, where I instead trace a new ray from the other side of the face (this adds about 50% compute time compared to any-hit in my use case):

if (sbtData.transparent && getPayloadWindow() < 1)
	  {
		  const float u = optixGetTriangleBarycentrics().x;
		  const float v = optixGetTriangleBarycentrics().y;

		  const vec3f surfPos
			  = (1.f - u - v) * sbtData.vertex[index.x]
			  + u * sbtData.vertex[index.y]
			  + v * sbtData.vertex[index.z];

		  vec3f pixelColorPRD = vec3f(0.f);
		  uint32_t u0, u1, u2;
		  packPointer(&pixelColorPRD, u0, u1);
		  u2 = getPayloadWindow() + 1;

		  optixTrace(optixLaunchParams.traversable,
			  surfPos + 1e-3f * rayDir,
			  rayDir,
			  0.f,    // tmin
			  1e20f,  // tmax
			  0.0f,   // rayTime
			  OptixVisibilityMask( 255 ),
			  OPTIX_RAY_FLAG_DISABLE_ANYHIT,//OPTIX_RAY_FLAG_NONE,
			  SURFACE_RAY_TYPE,             // SBT offset
			  RAY_TYPE_COUNT,               // SBT stride
			  SURFACE_RAY_TYPE,             // missSBTIndex 
			  u0, u1, u2);

		  prd = pixelColorPRD;
	  }

But it gives the output I’m looking for:

Any help would be much appreciated!

Any-hit programs are not invoked in ray-direction order but in BVH traversal order.
That means you cannot exit after the first any-hit invocation and assume it was the first face along the ray direction.
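
As a rough illustration of the ordering problem, here is a host-side C++ model (not OptiX code). The `Candidate` struct and both functions are hypothetical names made up for this sketch, and it assumes every candidate lies on transparent geometry. `countingAnyHit` mimics the posted any-hit logic on candidates delivered in traversal order, while `secondNearest` is what the poster actually wants.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one intersection handed to the any-hit program.
struct Candidate { float t; };

// Mimics the posted any-hit logic: ignore the first candidate that shows up,
// accept the next one and stop. The input is in traversal order, which is
// not guaranteed to be sorted by t.
float countingAnyHit(const std::vector<Candidate>& traversalOrder) {
    unsigned int hitcount = 0;
    for (const Candidate& c : traversalOrder) {
        if (hitcount < 1) { ++hitcount; continue; }  // like optixIgnoreIntersection()
        return c.t;                                  // like optixTerminateRay()
    }
    return -1.0f;  // nothing accepted
}

// Intended result: skip the nearest surface and return the one behind it.
float secondNearest(std::vector<Candidate> candidates) {
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) { return a.t < b.t; });
    return candidates.size() > 1 ? candidates[1].t : -1.0f;
}
```

With candidates arriving as t = 3, 1, 2, the counting version ignores the farthest surface and accepts the nearest one, which is exactly the face that should have been skipped.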

Another issue is that the BVH builder may split primitives, which means that one primitive can be contained in multiple AABBs and the any-hit program can therefore be invoked multiple times for it, depending on the BVH traversal order. That means counting intersections inside the any-hit program is not trivial either.
Please have a look at the description of OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL here, which addresses this:
https://raytracing-docs.nvidia.com/optix7/guide/index.html#acceleration_structures#primitive-build-inputs
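
To make the overcounting concrete, here is a small host-side C++ model (again not OptiX code; both function names are hypothetical). Each vector entry stands for the primitive index seen by one any-hit invocation along a single ray, assuming the flag above is not set.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Naive counter: one increment per any-hit invocation.
std::size_t countInvocations(const std::vector<int>& invokedPrimIds) {
    return invokedPrimIds.size();
}

// Number of distinct primitives that were actually crossed by the ray.
std::size_t countDistinctPrimitives(const std::vector<int>& invokedPrimIds) {
    return std::set<int>(invokedPrimIds.begin(), invokedPrimIds.end()).size();
}
```

If a split primitive 4 is reported twice, as in the sequence 4, 7, 4, the naive counter reports three hits although the ray only crossed two primitives.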

Given your current example geometries are all convex, the easiest approach to skip the front faces for those would be to do front face culling which is a flag on the ray in OptiX 7.
Search the OptiX Programming Guide for OPTIX_RAY_FLAG_CULL_FRONT_FACING_TRIANGLES.

That alone would affect all geometries, but face culling can be disabled at the instance level: you could set OPTIX_INSTANCE_FLAG_DISABLE_TRIANGLE_FACE_CULLING on the instances of the opaque objects, which overrides the ray flag.

But if you need to handle concave shapes as well, the continuation ray would be the most robust approach to process intersections in order.

Thanks for the reply, explains it all!

The thing is, the geometry could really have any shape and be convex, concave or single-faced, so OPTIX_RAY_FLAG_CULL_FRONT_FACING_TRIANGLES would probably not work. Sorry for not clarifying.

I guess I’ll stay with the continuation ray then (I’m assuming you’re referring to my second approach).
Would you have any suggestions on a more efficient implementation of this? Here’s my full closest hit program at the moment:

  extern "C" __global__ void __closesthit__radiance()
  {
	  const TriangleMeshSBTData &sbtData
		  = *(const TriangleMeshSBTData*)optixGetSbtDataPointer();

	  const int   primID = optixGetPrimitiveIndex();
	  const vec3i index = sbtData.index[primID];
	  const vec3f rayDir = optixGetWorldRayDirection();

	  vec3f &prd = *(vec3f*)getPRD<vec3f>();


	  // compute if window is transparent

	  if (sbtData.transparent && getPayloadWindow() < 1)
	  {
		  const float u = optixGetTriangleBarycentrics().x;
		  const float v = optixGetTriangleBarycentrics().y;

		  const vec3f surfPos
			  = (1.f - u - v) * sbtData.vertex[index.x]
			  + u * sbtData.vertex[index.y]
			  + v * sbtData.vertex[index.z];

		  vec3f pixelColorPRD = vec3f(0.f);
		  uint32_t u0, u1, u2;
		  packPointer(&pixelColorPRD, u0, u1);
		  u2 = getPayloadWindow() + 1;

		  optixTrace(optixLaunchParams.traversable,
			  surfPos + 1e-3f * rayDir,
			  rayDir,
			  0.f,    // tmin
			  1e20f,  // tmax
			  0.0f,   // rayTime
			  OptixVisibilityMask( 255 ),
			  OPTIX_RAY_FLAG_DISABLE_ANYHIT,//OPTIX_RAY_FLAG_NONE,
			  SURFACE_RAY_TYPE,             // SBT offset
			  RAY_TYPE_COUNT,               // SBT stride
			  SURFACE_RAY_TYPE,             // missSBTIndex 
			  u0, u1, u2);

		  prd = pixelColorPRD;
	  }
	  else
	  {

	  //temporary simple shading

	  const vec3f &A     = sbtData.vertex[index.x];
	  const vec3f &B     = sbtData.vertex[index.y];
	  const vec3f &C     = sbtData.vertex[index.z];
	  const vec3f Ng     = normalize(cross(B-A,C-A));

	  const float cosDN  = 0.5f + .5f*fabsf(dot(rayDir,Ng));

	  prd = cosDN * sbtData.color;
	  }
  }

Worth mentioning that I might in the future set the transparent geometry hitcount to something larger than 1.

Not sure what the renderer is doing elsewhere, but I would not program that recursively.

If you only ever trace a continuation ray when hitting a front face, you could also return to the ray generation program and use an iterative algorithm that way, by continuing the same ray with a different t_min or different origin, like you do now.

If you need to handle a larger number of concave folds, the iterative way doesn’t need as much stack space. An iterative approach actually needs the minimum possible stack size, which is why it should always be preferred over recursive algorithms on the GPU.

The closest hit program would only need to indicate whether the path is terminated and the distance from which to continue the next path segment in the same direction.
That could be combined into a single float distance on the payload: negative means end of path, positive means step along the ray by that distance.
Initialize it with a negative value to handle the miss case. Then you might not even need a miss program.
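
The signed-distance protocol described above can be sketched on the host like this. It is a simplified C++ model, not OptiX code: `Surface`, `Payload`, `traceSegment` and `shadeRay` are hypothetical names, and `traceSegment` stands in for `optixTrace` plus the closest hit program along a single ray.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scene description: surfaces along one ray at parameter t.
struct Surface { float t; bool transparent; };

struct Payload {
    float distance;  // < 0: path ended, > 0: continue from this distance
    int   hitIndex;  // index of the shaded surface, -1 if nothing was shaded
};

// Stand-in for optixTrace + closest hit: finds the nearest surface beyond tmin.
void traceSegment(const std::vector<Surface>& scene, float tmin,
                  bool mayContinue, Payload& prd) {
    int nearest = -1;
    for (std::size_t i = 0; i < scene.size(); ++i) {
        if (scene[i].t > tmin && (nearest < 0 || scene[i].t < scene[nearest].t))
            nearest = static_cast<int>(i);
    }
    if (nearest < 0) return;              // miss: payload keeps its negative default
    if (scene[nearest].transparent && mayContinue)
        prd.distance = scene[nearest].t;  // ask the caller to continue behind it
    else
        prd.hitIndex = nearest;           // shade here, distance stays negative
}

// Iterative loop as it would sit in the ray generation program.
int shadeRay(const std::vector<Surface>& scene, int maxSkips) {
    float tmin = 0.0f;
    for (int skipped = 0; ; ++skipped) {
        Payload prd = { -1.0f, -1 };      // negative default covers the miss case
        traceSegment(scene, tmin, skipped < maxSkips, prd);
        if (prd.distance < 0.0f) return prd.hitIndex;
        tmin = prd.distance + 1e-3f;      // step just past the skipped surface
    }
}
```

Because the payload starts out negative, a miss needs no extra handling, and raising `maxSkips` covers the case of skipping more than one transparent surface without any recursion.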

That means you could implement this as an iterative path tracer which produces complete images in one launch, so not a Monte Carlo one.
That will lead to the same number of rays per pixel as before, so the performance impact comes from the longer tails of the paths which haven’t finished yet.

My old OptiX 5.1.0 based introduction examples show how to arrive at a simple iterative path tracer step by step. I ported two of the later ones to OptiX 7.
Links to new and old versions here: https://github.com/NVIDIA/OptiX_Apps or in the sticky posts on this sub-forum.

Thanks again droettger!

Implemented it like this in raygen:

	  prd->transCount = 0;
	  prd->distance = -1;
	  prd->color = optixLaunchParams.world.backgroundColor;
	  
	  uint32_t u0, u1;	
	  packPointer(prd, u0, u1);

	  for(;;)
	  {
		  optixTrace(
			  handle,
			  ray_origin,
			  ray_direction,
			  tmin,
			  tmax,
			  0.0f,                     // rayTime
			  OptixVisibilityMask(255),
			  OPTIX_RAY_FLAG_DISABLE_ANYHIT,
			  SURFACE_RAY_TYPE,        // SBT offset
			  RAY_TYPE_COUNT,           // SBT stride
			  SURFACE_RAY_TYPE,        // missSBTIndex
			  u0, u1);

		  if (prd->distance < 0 || prd->transCount > 0)
			  break;
		  else
		  {
			  prd->transCount++;
			  tmin = prd->distance + 1e-3f;
		  }
	  } 

and in closest hit:

  extern "C" __global__ void __closesthit__radiance()
  {
    	  // get generic constants
    	  const TriangleMeshSBTData &sbtData = *(const TriangleMeshSBTData*)optixGetSbtDataPointer();
    	  const int   primID = optixGetPrimitiveIndex();
    	  const vec3i index = sbtData.index[primID];

    	  RadiancePRD& prd = *(RadiancePRD*)getPRD<RadiancePRD>();

    	  // check if transparent

    	  if (sbtData.transparent && prd.transCount < 1)
    	  {
    		  const vec3f rayOrig = optixGetWorldRayOrigin();

    		  const float u = optixGetTriangleBarycentrics().x;
    		  const float v = optixGetTriangleBarycentrics().y;

    		  const vec3f surfPos
    			  = (1.f - u - v) * sbtData.vertex[index.x]
    			  + u * sbtData.vertex[index.y]
    			  + v * sbtData.vertex[index.z];

    		  prd.distance = length(surfPos - rayOrig);
    	  }
    	  else
    	  {
    		  const vec3f rayDir = optixGetWorldRayDirection();
    		  
    		  //temporary simple shading

    		  const vec3f &A     = sbtData.vertex[index.x];
    		  const vec3f &B     = sbtData.vertex[index.y];
    		  const vec3f &C     = sbtData.vertex[index.z];
    		  const vec3f Ng     = normalize(cross(B-A,C-A));

    		  const float cosDN  = 0.5f + .5f*fabsf(dot(rayDir,Ng));

    		  prd.color = cosDN * sbtData.color;
    		  prd.hitIndex = sbtData.matIndex;
    		  prd.distance = -1;
          }
  }

The performance gain is maybe negligible at the moment, but the code is cleaner.

One last question, would there be a quicker way to get the distance from ray origin to the intersection point? I guess I could save the intersection point in the prd instead and just use that plus raydir * 1e-3f as new origin as you said earlier.

Reading up on the API, I realized that optixGetRayTmax() should give me the distance if called in a closest hit program.
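
One caveat worth spelling out: the value is the ray parameter t of the hit, so it equals the travelled distance only when the ray direction is normalized. A minimal host-side C++ sketch of that relationship (the `Vec3` type and the function names are hypothetical, not part of OptiX):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline float vecLength(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Hit position from the ray parameter t: origin + t * direction.
inline Vec3 hitPoint(const Vec3& origin, const Vec3& dir, float t) {
    return { origin.x + t * dir.x, origin.y + t * dir.y, origin.z + t * dir.z };
}

// Distance travelled for a given t; equals t only if the direction has length 1.
inline float travelledDistance(const Vec3& dir, float t) {
    return t * vecLength(dir);
}
```

With a unit-length direction, t can be stored directly as the continuation distance, which avoids interpolating the vertex positions just to measure the length.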

Or you could have looked through my examples:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/shaders/closesthit.cu#L164
I need the traveled distance inside the ray generation program to compute volume absorption of the material the ray is currently inside (and in my renderers supporting volume scattering that is also used to sample the random walk distance.)

How big is your per ray payload?
If it’s only float3 color, int hitIndex, float distance then it’s faster to put these into five of the available eight 32-bit payload registers instead of taking the detour via the payload pointer which incurs slower memory accesses.
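
For context, passing a payload pointer means splitting a 64-bit address across two 32-bit payload values and dereferencing it again in the hit programs. The host-side C++ sketch below shows that round trip. `splitPointer` and `joinPointer` are hypothetical names, written on the assumption that the packPointer/getPRD helpers used in this thread work along these lines.

```cpp
#include <cstdint>

// Splits a pointer into two 32-bit values, high word first.
inline void splitPointer(void* ptr, uint32_t& i0, uint32_t& i1) {
    const uint64_t uptr = reinterpret_cast<uintptr_t>(ptr);
    i0 = static_cast<uint32_t>(uptr >> 32);
    i1 = static_cast<uint32_t>(uptr & 0xffffffffull);
}

// Rebuilds the pointer from the two 32-bit values.
inline void* joinPointer(uint32_t i0, uint32_t i1) {
    const uint64_t uptr = (static_cast<uint64_t>(i0) << 32) | i1;
    return reinterpret_cast<void*>(static_cast<uintptr_t>(uptr));
}
```

Every read or write through the rebuilt pointer is a memory access, whereas values placed directly in the payload registers are handed to the programs without that detour.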

Sorry, I will as I progress.

For now I think my values should fit within the payload registers, yes. Would it always be faster to avoid the pointer if possible, even though I have to call float_to_int and int_to_float?

Yes, avoiding memory accesses is crucial for performance.

The conversions are not actually happening. That is just a matter of reinterpreting a 32-bit register for the compiler.

Mind that the payload registers are defined as 32-bit unsigned int in OptiX, so the correct conversions are float_as_uint() and uint_as_float() or when only used in device code, the built-in versions __float_as_uint() and __uint_as_float() directly.
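
A host-side C++ sketch of the difference between reinterpreting bits and converting values. `floatAsUint` and `uintAsFloat` are hypothetical stand-ins for the device built-ins named above, and `RadiancePayload` with its pack/unpack helpers is an assumed layout for the five values discussed in this thread.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the 32 bits of a float as an unsigned int; no numeric conversion.
inline uint32_t floatAsUint(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

// Reinterpret the 32 bits of an unsigned int as a float.
inline float uintAsFloat(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Assumed payload layout: color, hit index and continuation distance.
struct RadiancePayload { float r, g, b; int hitIndex; float distance; };

// Spread the payload over five 32-bit values, as the payload registers would hold it.
inline void packPayload(const RadiancePayload& p, uint32_t regs[5]) {
    regs[0] = floatAsUint(p.r);
    regs[1] = floatAsUint(p.g);
    regs[2] = floatAsUint(p.b);
    regs[3] = static_cast<uint32_t>(p.hitIndex);
    regs[4] = floatAsUint(p.distance);
}

inline RadiancePayload unpackPayload(const uint32_t regs[5]) {
    return { uintAsFloat(regs[0]), uintAsFloat(regs[1]), uintAsFloat(regs[2]),
             static_cast<int>(regs[3]), uintAsFloat(regs[4]) };
}
```

Reinterpreting 1.0f yields the bit pattern 0x3F800000, whereas a value conversion such as static_cast<uint32_t>(1.0f) yields 1 and would lose the fractional part and the sign of a negative distance.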

Looks like the OptiX SDK examples have consistently ignored that. I’ll have that fixed.

Thanks, will keep that in mind!

Also in this article (float_as_int(), int_as_float()): https://developer.nvidia.com/blog/how-to-get-started-with-optix-7/