Too bright results when importance sampling the GGX NDF

This is a crosspost of my post on computergraphics.stackexchange as it isn’t getting much visibility there.

I’ve been trying to importance sample the GGX NDF of my Cook-Torrance BRDF for some time now but I can’t get it right.

I’ve been following this article.

Here’s my code for the importance sampling part. It samples a direction and returns it. It also computes the corresponding probability for that direction and stores it in the pdf argument of cook_torrance_sample_direction.

inline vec3f __device__ cook_torrance_sample_direction(PerRayData& prd, vec3f& view_direction, vec3f& surface_normal, float roughness, float& pdf)
{
	float roughness2 = roughness * roughness;

	//rand_theta in [0.0, 1.0]
	float rand_theta = prd.random();

	float theta = atan(roughness * sqrtf(rand_theta / (1.0f - rand_theta)));
	float phi = prd.random() * 2.0f * M_PI;//Isotropic case

	float cos_theta = cos(theta);
	float sin_theta = sin(theta);
	float cos_phi = cos(phi);
	float sin_phi = sin(phi);

	//+0.00001f to avoid dividing by 0
	float denom = (roughness2 - 1.0f) * cos_theta * cos_theta + 1.0f + 0.00001;
	float pdf_half_vector = (2 * roughness2 * cos_theta * sin_theta) / (denom * denom);

	vec3f sampled_normal_local = vec3f(cos_phi * sin_theta,
									   sin_phi * sin_theta,
									   cos_theta);

	vec3f view_direction_local = world_to_local(surface_normal, view_direction);

	//Section "One Extra Step" of the article
	pdf = pdf_half_vector / (4 * dot(view_direction_local, sampled_normal_local) + 0.00001f);

	vec3f surface_normal_local = vec3f(0.0f, 0.0f, 1.0f);
	vec3f reflected_direction_local = normalize(-view_direction_local - 2 * dot(-view_direction_local, sampled_normal_local) * sampled_normal_local);
	vec3f reflected_dir_world = local_to_world(surface_normal, reflected_direction_local);

	return reflected_dir_world;
}

The sampled direction is then passed to the BRDF evaluation function (as the light_direction argument) which returns a color. I then divide this color by the PDF:

inline vec3f __device__ cook_torrance_brdf(const CookTorranceMaterial& material, const vec3f& view_dir, const vec3f& light_direction, const vec3f& normal)
{
	vec3f halfway_vector = normalize(view_dir + light_direction);

	float NoV = max(dot(normal, view_dir),            0.0f);
	float NoL = max(dot(normal, light_direction),  0.0f);
	float NoH = max(dot(normal, halfway_vector),      0.0f);
	float VoH = max(dot(view_dir, halfway_vector),    0.0f);

	vec3f F0 = (0.16f * (material.reflectance * material.reflectance));
	//F0 for metals is equal to the albedo.
	//We're going to lerp between the albedo and 0.04
	//(previous value of F0) depending on
	//the metalness of the material. A fully metalic material
	//will thus have a F0 equal to its albedo
	F0 = (1.0f - material.metallic) * F0 + material.metallic * material.albedo;

	//Fresnel: reflected portion of the light (1 - transmitted)
	vec3f F = schlick_approximation(VoH, F0);
	float NDF = GGX_NDF(NoH, material.roughness);
	float G = geometry_Smith(NoV, NoL, material.roughness);

	vec3f kS = F;
	vec3f kD = vec3f(1.0f) - kS;
	kD *= 1.0 - material.metallic;

	vec3f numerator = NDF * G * F;
	//+0.0001f to avoid dividng by zero if the dot products are 0
	float denominator = 4.0f * NoV * NoL + 0.0001f;
	vec3f specular = numerator / denominator;

	// add to outgoing radiance Lo
	return (kD * material.albedo / (float)M_PI + specular);
}

inline vec3f __device__ schlick_approximation(float cos_theta, const vec3f& F0)
{
	return F0 + (1.0f - F0) * powf((1.0f - cos_theta), 5.0f);
}

inline float __device__ GGX_NDF(float NoH, float roughness)
{
	float alpha = roughness * roughness;
	float alpha2 = alpha * alpha;
	float NoH2 = NoH * NoH;

	float num = alpha2;
	float denom = (NoH2 * (alpha2 - 1.0f) + 1.0f);
	denom = (float)M_PI * denom * denom;

	return num / denom;
}

inline float __device__ geometry_Schlick_GGX(float NoX, float roughness)
{
	float r = (roughness + 1.0);
	float k = (r*r) / 8.0;

	float num = NoX;
	float denom = NoX * (1.0 - k) + k;

	return num / denom;
}

inline float __device__ geometry_Smith(float NoV, float NoL, float roughness)
{
	float ggx2 = geometry_Schlick_GGX(NoV, roughness);
	float ggx1 = geometry_Schlick_GGX(NoL, roughness);

	return ggx1 * ggx2;
}

However, this is giving me results brighter than expected for low roughness values. Here is a roughness of 0.05:

And for roughness lower than 0.013, black pixels start appearing (and eventually cover the entire image):

I noticed that for the too-bright case of the roughness = 0.05, my evaluation function returns color vectors whose RGB values are over 200 (sometimes way more than that, reaching thousands). I suppose this is okay and expected as long as the PDF counter balances it (which it doesn’t because in those cases, the value of the PDF is only between ~5.0 and ~15.0)?

Let’s address the two symptoms you describe first:

When a BSDF looks too bright although it’s not having any emission that is usually an indication for a problem with the pdf value of either the light or the BSDF.

I usually implement unidirectional path tracers with direct lighting (next event estimation) and multiple importance sampling.
The standard test if things are correct, is to also implement a code path without direct lighting, meaning a brute force path tracer which only works with area lights, and if the two different code paths do not converge to the same result in the accumulated image, something is usually wrong with the light pdf value.

Other standard tests are the white furnace test, a sphere with a white material inside a white (1.0f bright) spherical environment. When changing the tonemapper, the white sphere must remain invisible or something is either generating or absorbing light, which microfacet models without multiscatter compensation often do.

The third standard test for lights is a sphere with a fully transparent white material, index of refraction 1.0 and no volume absorption or scattering over some diffuse grey ground plane. That sphere must remain invisible. No shadow must appear on the ground plane from the light transport algorithm or again the light pdf is incorrect.

The other symptom you describe with pixels getting black and never converge against the correct color is either Nan values which never go away or negative values.

You would need to check if any of the resulting color components are NaN or negative (or infinite) and find out where exactly these values come from.
I debug such cases with ultra-high primary colors to not be averaged away in an accumulating path tracer:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/shaders/raygeneration.cu#L285

The GGX microfacet model is not working below a certain small roughness threshold, when it gets too smooth, it becomes numerically instable. Maybe that is the case in your implementation as well and you’d need to clamp the roughness value resp. switch to a specular reflection when you’re falling below these roughness values.
The example code below supports anisotropic roughness and clamps it:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/src/Application.cpp#L854

I don’t have an example which is using Fresnel on the microfacet distribution but there are multiple examples for GGX with Smith shadowing term in my OptiX 7 examples and the MDL-SDK libbsdf code.
Maybe have a look at these two implementations.

https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/shaders/brdf_ggx_smith.cu

https://github.com/NVIDIA/MDL-SDK/blob/master/src/mdl/jit/libbsdf/libbsdf.cpp#L3018

2 Likes