OptiX denoiser implementation exhibits black square artifacts

I integrated the OptiX denoiser into our path tracer using OptiX 8.0.0 and CUDA 12.6. While I get decent results with some scenes, I get black square artifacts in other scenes after several path tracing samples.

Please see this video for a demo description: https://youtu.be/AGtOWDGt17M

At 100 samples per pixel (the glitch is already there), the gbuffer/AOVs look like this:
https://www.dropbox.com/scl/fi/j94a6mnib7ddcgugb2wmq/gBuffer.zip?rlkey=x4b1985pp276hmc6c9t5la6f0&st=6y9e3jp3&dl=0
Meaning the denoiser input is good.

Workflow:

  1. Rendering start: we create a new denoiser.
  2. Every time we render a new frame (accumulated by cumulative moving average, sketched below), we feed it to the OptiX denoiser and its output becomes the current render.
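
A minimal sketch (hypothetical names) of the accumulation in step 2: after n accumulated samples the new frame is blended in with weight 1/(n+1), and the resulting average is what we hand to the denoiser as the noisy color input.

inline float cumulativeAverage(float accum, float sample, Uint32 n)
{
	// accum: running average after n samples, sample: value from the new frame
	return accum + (sample - accum) / static_cast<float>(n + 1);
}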

What am I missing? Our implementation is straightforward. We use the unmodified denoiser wrapper from the toolkit (OptiXDenoiser.h).

HEADER
#include "OptiXDenoiser.h" // SDK wrapper

struct NvidiaOptixDenoiser
{
	NvidiaOptixDenoiser();
	~NvidiaOptixDenoiser();

	void denoise(Graphics::Texture::Image& imageOut, const DenoisingArgs& denoisingArgs);

private:
	OptiXDenoiser* m_denoiser = nullptr;
	nbBool m_firstFrame = true;
};

SIMPLIFIED CPP: To test the issue we made a test branch that uses RGBA textures instead of RGB, since that is what the denoiser wrapper expects. Unfortunately it exhibits the square artifacts glitch as well.

NvidiaOptixDenoiser::NvidiaOptixDenoiser()
{
    m_denoiser = new OptiXDenoiser();
}

NvidiaOptixDenoiser::~NvidiaOptixDenoiser()
{
    m_denoiser->finish();
    delete m_denoiser;
}

void NvidiaOptixDenoiser::denoise(Graphics::Texture::Image& imageOut, const DenoisingArgs& denoisingArgs)
{
	ASSERT(imageOut.getFormat() == ImageFormat::RGBA32F);
	const Graphics::GBuffer* gBuffer = denoisingArgs.gBuffer;
	ASSERT(gBuffer);

	const Uint32 width = imageOut.getWidth();
	const Uint32 height = imageOut.getHeight();

	//-----------------------------------------------------------------------------------------------------------------------------------------------------
	// Build the Optix input data
	//-----------------------------------------------------------------------------------------------------------------------------------------------------

	OptiXDenoiser::Data optixDenoisingData;
	optixDenoisingData.width = width;
	optixDenoisingData.height = height;
	optixDenoisingData.color = reinterpret_cast<const float*>(gBuffer->m_radiance->getRawData());
	optixDenoisingData.normal = reinterpret_cast<const float*>(gBuffer->m_normals->getRawData());
	optixDenoisingData.albedo = reinterpret_cast<const float*>(gBuffer->m_albedo->getRawData());
	optixDenoisingData.flow = reinterpret_cast<const float*>(gBuffer->m_motionVectors->getRawData());

	optixDenoisingData.outputs.push_back(reinterpret_cast<float*>(imageOut.getMutableRawData()));

	//-----------------------------------------------------------------------------------------------------------------------------------------------------
	// Perform denoising
	//-----------------------------------------------------------------------------------------------------------------------------------------------------
	if (m_firstFrame)
	{
		m_denoiser->init(optixDenoisingData, 0, 0, false, true, false, false, 0, false);
		m_firstFrame = false;
	}
	else
	{
		m_denoiser->update(optixDenoisingData);
	}

	m_denoiser->exec();
	m_denoiser->getResults();
}

PRODUCTION CPP: The CPP of our main branch. We use RGB textures. Unsurprisingly, the black squares issue is still there.

NvidiaOptixDenoiser::NvidiaOptixDenoiser()
{
    m_denoiser = new OptiXDenoiser();
}

NvidiaOptixDenoiser::~NvidiaOptixDenoiser()
{
    m_denoiser->finish();
    delete m_denoiser;
}

void NvidiaOptixDenoiser::denoise(Graphics::Texture::Image& imageOut, const DenoisingArgs& denoisingArgs)
{
    ASSERT(imageOut.getFormat() == ImageFormat::RGB32F);
    RGB32FImage* imageRGBFOut = dynamic_cast<RGB32FImage*>(&imageOut);
    ASSERT(imageRGBFOut);

    const Graphics::GBuffer* gBuffer = denoisingArgs.gBuffer;
    ASSERT(gBuffer);

    const std::unique_ptr<Graphics::Texture::RGB32FImage>& imageIn = gBuffer->m_radiance;

    const Uint32 width = imageOut.getWidth();
    const Uint32 height = imageOut.getHeight();

    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    // Build the input images. The OptiX denoiser only supports FLOAT4. Maybe it is just a limitation of the wrapper?
    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    RGBA32FImage optixSrcColor(width, height);
    RGBA32FImage optixNormals(width, height);
    RGBA32FImage optixAlbedo(width, height);
    RGBA32FImage optixFlow(width, height);
    const Uint32 optixNbPixels = width * height;

    tbb::parallel_for(size_t(0), size_t(optixNbPixels), [&](size_t tbbIdx) {
	    const Uint32 pixelIdx = (nbUint32)tbbIdx;

	    const Uint32 pixelPosX = (nbUint32)(pixelIdx % width);
	    const Uint32 pixelPosY = (nbUint32)(pixelIdx / width);
	    const Math::Uvec2 pixelPos = Math::Uvec2(pixelPosX, pixelPosY);

	    {
		    // In color.
		    {
			    const RGBFColor color = imageIn->getPixelFromPosition(pixelPos);
			    optixSrcColor.setPixelFromPosition(RGBAFColor(color.x, color.y, color.z, 0.0f), pixelPos);
		    }

		    // Normals.
		    {
			    const RGBFColor normal = gBuffer->m_normals->getPixelFromPosition(pixelPos);
			    optixNormals.setPixelFromPosition(RGBAFColor(normal.x, normal.y, normal.z, 0.0f), pixelPos);
		    }

		    // Albedo.
		    {
			    const RGBFColor albedo = gBuffer->m_albedo->getPixelFromPosition(pixelPos);
			    optixAlbedo.setPixelFromPosition(RGBAFColor(albedo.x, albedo.y, albedo.z, 0.0f), pixelPos);
		    }

		    // In flow. They are motion vectors
		    {
			    const RGBFColor flow = denoisingArgs.gBuffer->m_motionVectors->getPixelFromPosition(pixelPos);
			    optixFlow.setPixelFromPosition(RGBAFColor(flow.x, flow.y, flow.z, 0.0f), pixelPos);
		    }
	    }
    });

    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    // Build the Optix input data
    //-----------------------------------------------------------------------------------------------------------------------------------------------------

    RGBA32FImage optixDestColor(width, height);

    OptiXDenoiser::Data optixDenoisingData;
    optixDenoisingData.width = width;
    optixDenoisingData.height = height;
    optixDenoisingData.color = reinterpret_cast<const float*>(optixSrcColor.getRawData());
    optixDenoisingData.normal = reinterpret_cast<const float*>(optixNormals.getRawData());
    optixDenoisingData.albedo = reinterpret_cast<const float*>(optixAlbedo.getRawData());
    optixDenoisingData.flow = reinterpret_cast<const float*>(optixFlow.getRawData());

    optixDenoisingData.outputs.push_back(reinterpret_cast<float*>(optixDestColor.getMutableRawData()));

    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    // Perform denoising
    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    if (m_firstFrame)
    {
	    m_denoiser->init(optixDenoisingData, 0, 0, false, true, false, false, 0, false);
	    m_firstFrame = false;
    }
    else
    {
	    m_denoiser->update(optixDenoisingData);
    }

    m_denoiser->exec();
    m_denoiser->getResults();

    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    // Read back results
    //-----------------------------------------------------------------------------------------------------------------------------------------------------
    tbb::parallel_for(size_t(0), size_t(optixNbPixels), [&](size_t tbbIdx) {
	    const Uint32 pixelIdx = (nbUint32)tbbIdx;

	    const Uint32 pixelPosX = (nbUint32)(pixelIdx % width);
	    const Uint32 pixelPosY = (nbUint32)(pixelIdx / width);
	    const Math::Uvec2 pixelPos = Math::Uvec2(pixelPosX, pixelPosY);

	    const RGBAFColor color = optixDestColor.getPixelFromPosition(pixelPos);
	    imageRGBFOut->setPixelFromPosition(RGBFColor(color.x, color.y, color.z), pixelPos);
    });
}

We can provide executables as well as the test scenes if that can help solve the issue :)
The problem occurs on both an RTX 3080 and an RTX 3060.
Thank you very much for helping!

Hi @studentechamp, is it buggy even when changing the order of operations? I’ve watched the video: you first load Sponza with denoise off, then turn denoise on, then load the dragon; at that point, is denoise on or off? Maybe the bool flag is not what it should be when loading the dragon and things aren’t properly initialised and/or cleared. For debugging, please remove the parallel_for, just in case.

I can repro the problem. Without normals.exr the result is bug-free. The normal map is bluish, so that’s tangent space I suppose. It must be in camera space (red-green colors); please see “optix denoiser input buffer questions - Visualization / OptiX - NVIDIA Developer Forums” and “How to use / implement normal maps - #2 by dhart”.

Thanks for helping :). I removed the normals usage and that fixes the issue. But there is a non-negligible loss of quality in the Sponza scene.

With normals.

Without normals.

Especially in this area:

I am currently using world space normals. So I tried the following.

  1. Convert the normal from world space to view space:

    const Math::Vec3 normal = (worldToView * Math::Vec4(worldNormal.x, worldNormal.y, worldNormal.z, 0.0f)).xyz();

This doesn’t solve the issue. With this, the normals look like this:

  2. Trying to use the inverse camera matrix (viewToWorld) as suggested here:

    const Math::Vec3 normal = (viewToWorld * Math::Vec4(worldNormal.x, worldNormal.y, worldNormal.z, 0.0f)).xyz();

Doesn’t solve the issue either. Normals look like this:

My engine uses a right-handed coordinate system, so it should be fine.
Thank you !

Right now I don’t know what the problem is; it might be a wrong image format:

  • For camera space the input image has only X,Y components, not X,Y,Z.
  • For world space the image must have X,Y,Z.

You can also try normals in world space since OptiX 8 (see the release notes, OptiX SDK Release Notes 8.0.1), but you need OPTIX_DENOISER_MODEL_KIND_AOV.
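
For reference, a minimal sketch of creating the denoiser with the AOV model kind via the raw OptiX 8 API (the context variable and error-check macro are assumed); this is roughly what the SDK wrapper selects when its kpMode flag is set:

OptixDenoiserOptions options = {};
options.guideAlbedo = 1;   // we pass an albedo guide layer
options.guideNormal = 1;   // we pass a normal guide layer

OptixDenoiser denoiser = nullptr;
OPTIX_CHECK(optixDenoiserCreate(context, OPTIX_DENOISER_MODEL_KIND_AOV, &options, &denoiser));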

  1. “For camera space the input image has only X,Y components, not X,Y,Z.”

I’m not sure I understand. How would camera-space normals only have two components?

  2. “For world space the image must have X,Y,Z.”

This is already what I do. I also tried OPTIX_DENOISER_MODEL_KIND_AOV by setting kpMode to true at denoiser initialization:

	if (m_firstFrame)
	{
		const bool kpMode = true; // HERE
		m_denoiser->init(optixDenoisingData, 0, 0, kpMode, true, false, false, 0, false);
		m_firstFrame = false;
	}
	else
	{
		m_denoiser->update(optixDenoisingData);
	}

This gives even worse results: https://youtu.be/Zkz84_-EqvQ

Maybe it is just an OptiX 8 bug? I can provide Release/Debug executables as well as test scenes if it can help solve this :)

Hi @studentechamp,

I’ve asked the OptiX denoiser engineer to take a look, but he’s on vacation so it may take a few days.

Can you repro this using the optixDenoiser command line SDK sample? This would mean saving your inputs, both image and normals, to separate files, and generating the output via command line. You can pass normals using the optional -n flag.

If this reproduces using the command line, then we can triage using only the input images, and we won’t need either code or executables.
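
For example, an invocation along these lines (the file names here are placeholders, and the flags match the ones used later in this thread):

optixDenoiser.exe -a albedo.exr -n normals.exr -o denoised.exr beauty.exr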

–
David.

Hey, BTW I’m looking at the code in the first comment and I’m not sure I understand. Where does the Graphics::GBuffer class come from? Does that handle copying the image buffer to the GPU? Or are you using unified or shared memory?

Have you checked whether this is some kind of synchronization problem? Try using the env var CUDA_LAUNCH_BLOCKING=1, or try adding a cudaDeviceSynchronize() immediately both before and after running the denoiser update.
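
A minimal sketch of that bracketing, reusing the names from your denoise() above (debugging only, and assuming the wrapper’s CUDA_CHECK macro is visible there):

CUDA_CHECK(cudaDeviceSynchronize()); // make sure all prior GPU work / copies are done
m_denoiser->update(optixDenoisingData);
m_denoiser->exec();
CUDA_CHECK(cudaDeviceSynchronize()); // block until the denoiser has actually finished
m_denoiser->getResults();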

–
David.

That was related to the OptiX version back in 2018.

Here is newer information for OptiX 8 from last year: [OptiX 8.0] Denoiser: Camera space vs World space? - #2 by droettger

You could use “World Space Normals”. In my OptiX 8-based ray tracer (using CUDA 12.2.2) they work very well; I use the local normals in object space, which I then convert into world space this way:

// vertex normals of the triangle
float3 n0 = normal_buffer[ idx0 ]; 
float3 n1 = normal_buffer[ idx1 ];
float3 n2 = normal_buffer[ idx2 ];

// apply barycentrics
float3 local_normal = normalize( n1*beta + n2*gamma + n0*alpha ); 


float3 denoiser_normal = normalize( optixTransformNormalFromObjectToWorldSpace(local_normal) );

And as shown here, I also invert the normal when the ray hits a backface.
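
That flip is just a one-liner, assuming a float3 dot() helper (as in the SDK’s vec_math.h) and that ray_dir is the incoming world-space ray direction:

if (dot(denoiser_normal, ray_dir) > 0.0f)
    denoiser_normal = -denoiser_normal; // make the guide normal face the camera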

@m001 Thank you very much for helping. I also negate the normals when the ray hits a backface. Same as you, I use local normals and then convert them to world space when needed:

inline Vertex Mesh::buildTransformedVertexFromIndex(Uint32 idx) const
{
	const auto& indices = getIndices();
	ASSERT(idx < indices.size());

	return buildTransformedVertex(
		indices[idx]
	);
}

inline Vertex Mesh::buildTransformedVertex(Uint32 idx) const
{
	const auto& vertices = getVertices();
	ASSERT(idx < vertices.size());

	Vertex vtx = vertices[idx];

	const auto& transform = getTransform();
	vtx.position = transform->transformPoint(vtx.position);
	vtx.normal = glm::normalize(transform->transformDirection(vtx.normal));
	vtx.tangent = glm::normalize(transform->transformDirection(vtx.tangent));
	vtx.bitangent = glm::normalize(transform->transformDirection(vtx.bitangent));

	return vtx;
}

The most annoying thing is that the issue doesn’t happen with all scenes.
When an intersection occurs I compute the intersection properties (that includes the normal!).
The buildTransformedVertexFromIndex from the previous snippet is used here :)

IntersectionProperties buildIntersectionProperties(const Math::Ray& ray, const Intersector::IntersectionInfo& info, const Scene::BaseScene* scene)
{
	const auto mesh = info.object;
	const auto P = ray.getPoint(info.meshIntersectData.t);

	const Uint32 triStartIdx = info.meshIntersectData.primId * 3u;

	const auto v1 = mesh->buildTransformedVertexFromIndex(triStartIdx);
	const auto v2 = mesh->buildTransformedVertexFromIndex(triStartIdx + 1);
	const auto v3 = mesh->buildTransformedVertexFromIndex(triStartIdx + 2);

	Float32 area = 0.0f;
	{
		const Math::Vec3 e2 = v2.position - v1.position;
		const Math::Vec3 e3 = v3.position - v1.position;

		area = 0.5f * glm::length(glm::cross(e2, e3));
		area = glm::max(area, 1e-10f);
	}

	const Math::Vec3 coefs = Math::interpolate(v1.position, v2.position, v3.position, P, area);

	// Read material
	const Model::ModelPtr& model = scene->getModel();
	const Material::BaseMaterial* material = model->fastGetMaterialRawPtr_FromEntityOrDefault(mesh->getMaterialId());

	// Texture coordinates
	const Math::Vec2 texCoord = material->interpolateTexCoordinates(v1.texCoord, v2.texCoord, v3.texCoord,  coefs);

	// Eye vector
	const Math::Vec3 V = -ray.getDirection();

	// Compute normal
	Math::Vec3 N = (v1.normal * coefs.x) + (v2.normal * coefs.y) + (v3.normal * coefs.z);

	// Tangent and bitangent
	const Math::Vec3 T = (v1.tangent * coefs.x) + (v2.tangent * coefs.y) + (v3.tangent * coefs.z);
	const Math::Vec3 B = (v1.bitangent * coefs.x) + (v2.bitangent * coefs.y) + (v3.bitangent * coefs.z);

	// Apply normal mapping
	if (material->isFresnelMaterial())
	{
		const auto* fresnelMat = static_cast<const Material::FresnelMaterial*>(material);
		const EntityIdentifier normalMapId = fresnelMat->getNormalImageId();
		
		if (normalMapId)
		{
			// Read bump map
			const auto image = Texture::fastGetRGBAImageRawPtr_FromEntity(normalMapId);
			if (image)
			{
				const RGBAFColor bumpMapNormal = image->getNormalizedPixelFromRatio(texCoord) * 2.0f - 1.0f;
				const Math::Mat3 tbn = Math::Mat3(T, B, N);

				// Bump mapped normal
				N = tbn * glm::swizzle<glm::X, glm::Y, glm::Z>(bumpMapNormal);
			}
		}
	}

	// Finalize normal
	N = glm::normalize(N);

	if (glm::dot(N, V) < 0.0f)
		N *= -1;

	IntersectionProperties props;
	props.P = P;
	props.deltaP = getOffsetedPositionInDirection(P, N, scene->getCurrentRenderSettings().m_rayEpsilon);
	props.inDeltaP = getOffsetedPositionInDirection(P, -N, scene->getCurrentRenderSettings().m_rayEpsilon);
	props.V = V;
	props.texCoord = texCoord;
	props.BsdfProps.N = N;
	props.BsdfProps.T = T;
	props.BsdfProps.B = B;

	return props;
}

Hi @dhart, thanks for helping, it is appreciated.

Here are some files when using world space normals. The glitch starts at sample 3:
https://www.dropbox.com/scl/fi/ept50ivxwhk1930bqt7dk/Using_WorldSpaceNormals.zip?rlkey=exorzuk8z9vkn5mv1sqbtps3n&st=do9f12v5&dl=0

Here are some files when using camera space normals. The glitch starts at sample 9:
https://www.dropbox.com/scl/fi/ut725evm2uf0mng9vn4we/Using_CameraSpaceNormals.zip?rlkey=2dbt4pw4majthr0bdbmkuir9n&st=05m9nluh&dl=0

Tried it :) It doesn’t solve the issue.

The gBuffer is just a set of outputs from the path tracer:

  1. Radiance: Radiance computed by path tracing
  2. Normals: The primary intersection normals
  3. Albedo: The primary intersection albedo
  4. Motion vectors: Unused for now, so just black

My path tracer runs on the CPU. I simply use the denoiser wrapper from the sample. So every time denoising must be done, the needed textures are uploaded to the GPU. From the sample:

void OptiXDenoiser::update(const Data& data)
{
	ASSERT(data.color);
	ASSERT(data.outputs.size() >= 1);
	ASSERT(data.width);
	ASSERT(data.height);
	ASSERT(!data.normal || data.albedo);

	m_host_outputs = data.outputs;

	CUDA_CHECK(cudaMemcpy((void*)m_layers[0].input.data, data.color, data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	if (m_temporalMode)
		CUDA_CHECK(cudaMemcpy((void*)m_guideLayer.flow.data, data.flow, data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	if (data.albedo)
		CUDA_CHECK(cudaMemcpy((void*)m_guideLayer.albedo.data, data.albedo, data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	if (data.normal)
		CUDA_CHECK(cudaMemcpy((void*)m_guideLayer.normal.data, data.normal, data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	if (data.flowtrust)
		CUDA_CHECK(cudaMemcpy((void*)m_guideLayer.flowTrustworthiness.data, data.flowtrust, data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	for (size_t i = 0; i < data.aovs.size(); i++)
		CUDA_CHECK(cudaMemcpy((void*)m_layers[1 + i].input.data, data.aovs[i], data.width * data.height * sizeof(float4), cudaMemcpyHostToDevice));

	if (m_temporalMode)
	{
		OptixImage2D temp = m_guideLayer.previousOutputInternalGuideLayer;
		m_guideLayer.previousOutputInternalGuideLayer = m_guideLayer.outputInternalGuideLayer;
		m_guideLayer.outputInternalGuideLayer = temp;

		for (size_t i = 0; i < m_layers.size(); i++)
		{
			temp = m_layers[i].previousOutput;
			m_layers[i].previousOutput = m_layers[i].output;
			m_layers[i].output = temp;
		}
	}
	m_params.temporalModeUsePreviousLayers = 1;
}

The camera space normals are 3D vectors, but z is constant (-1) and hence removed.

OptiX 8.0.0’s optixDenoiser sample with the bluish normals and albedo in AOV mode gives no black squares.

The diff between using normals and not using them is faint but not zero. Pics attached.
denoised-diff-aov-with-normals-minus-aov-albedo-only.zip (119.9 KB)

Also read this: OptiX 7.7 Black tile blinking in the bottom left corner when using the denoiser - #3 by droettger

So you are saying you do not have the issue with the files I supplied?
You sent an image of a single frame. Are you denoising with temporal mode enabled?

  1. As you suggested, I read this: OptiX 7.7 Black tile blinking in the bottom left corner when using the denoiser - #3 by droettger
    On my side, OptixDenoiserParams is already initialized by the denoiser.

     //
     // Setup denoiser
     //
     {
     	OPTIX_CHECK(optixDenoiserSetup(
     		m_denoiser,
     		nullptr,  // CUDA stream
     		m_tileWidth + 2 * m_overlap,
     		m_tileHeight + 2 * m_overlap,
     		m_state,
     		m_state_size,
     		m_scratch,
     		m_scratch_size
     	));
    
     	// HERE!
     	m_params.hdrIntensity = m_intensity;
     	m_params.hdrAverageColor = m_avgColor;
     	m_params.blendFactor = 0.0f;
     	m_params.temporalModeUsePreviousLayers = 0;
     }
    
  2. You said “OptiX 8.0.0’s optixDenoiser sample with the bluish normals and albedo in AOV mode gives no black squares.” Is the issue here in non-AOV mode? If so, how do you enable AOV mode?
    I tried to set kpMode to true at initialization, but it doesn’t solve anything:

     void OptiXDenoiser::init(const Data&  data,
     	unsigned int tileWidth,
     	unsigned int tileHeight,
     	bool         kpMode,
     	bool         temporalMode,
     	bool         applyFlowMode,
     	bool         upscale2xMode,
     	unsigned int alphaMode,
     	bool         specularMode)
    
  3. What command did you feed to optixDenoiser.exe, please?
    I’ll give it a try on my side and put breakpoints in Visual Studio.

Thank you! <3

Did you try it out without applying the normal mapping?
Try this:

Both of course transformed into world space normals.

The problem is that you have NaN values in your normals images.

oiiotool --stats Normal_3.exr
Normal_3.exr         : 1080 x  720, 3 channel, float openexr
    Stats Min: -0.999991 -0.974394 -0.999453 (float)
    Stats Max: 0.413637 0.999651 0.989695 (float)
    Stats Avg: -0.187715 0.063366 -0.002677 (float)
    Stats StdDev: 0.340781 0.220970 0.234004 (float)
*** Stats NanCount: 1 1 1 ***
    Stats InfCount: 0 0 0 
    Stats FiniteCount: 777599 777599 777599 
    Constant: No
    Monochrome: No

OpenImageIO has a tool for fixing NaNs (oiiotool --fixnan), and if you remove them from the normals image, then the black squares issue will go away.
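
For example, something along these lines should write out a cleaned copy (see the oiiotool docs for the --fixnan modes; box3 patches NaNs from neighboring pixels, black sets them to 0):

oiiotool Normal_3.exr --fixnan box3 -o Normal_3_fixed.exr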

I was also curious why the normals image has speckles. I can see why it might have noise due to pixel/ray jitter, but some pixels seem much brighter than they should be given the surrounding neighbors. Perhaps there’s a more general issue in your normals calculations?

–
David.


Filtering out the NaNs solves it. So it is not an OptiX bug but mine hehe :)
Thank you very much for helping!

I couldn’t find an obvious reason why it happens on our side. Will need to do a proper investigation.
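
In the meantime, a guard like this (a hypothetical helper, assuming the engine’s Math::Vec3) at the point where the normal is written into the gBuffer would at least keep a NaN from ever reaching the denoiser:

#include <cmath> // std::isfinite

inline Math::Vec3 sanitizeNormal(const Math::Vec3& n, const Math::Vec3& fallback)
{
	// Replace any non-finite normal with a safe fallback (e.g. the geometric normal).
	if (!std::isfinite(n.x) || !std::isfinite(n.y) || !std::isfinite(n.z))
		return fallback;
	return glm::normalize(n);
}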


Hi @studentechamp, the command line is:

optixDenoiser.exe -A albedo.exr -A normals.exr radiance.exr

The capital A is important: it activates AOV mode; otherwise it’s OPTIX_DENOISER_MODEL_KIND_HDR and the bad squares appear.

Temporal mode instead:

optixDenoiser.exe -A albedo.exr -A normals.exr -f motion.exr radiance.exr

Forcing temporalMode = true produces this image:

(The zip file I attached in my previous post contains three images I believe).

For completeness: I’ve tested the sequence of the first 0…9 images: in world space the black square appears at frame 3 (and two squares at frame 6), in camera space at frame 9.

Command line:

optixdenoiser.exe -o denoised-ws-+.exr -F 0-9 -n worldspace\Normals\Normal+.exr -a worldspace\Albedo\Albedo_+.exr -f worldspace\Flow\Flow_+.exr worldspace\Beauty\Beauty_+.exr