Debugger error.

TheDonsky · December 29, 2016, 9:23am

I’m writing a simple ray tracer and encountered a strange issue while testing. For some reason, I decided to switch to Debug mode (in VS 2015), and the output picture became really glitchy, nothing like what I was getting in Release mode. The Release build gives me a half-decent image, while the Debug build generates something that barely resembles the shapes it is supposed to show, with a lot of missing parts and no reflections. Is there any difference in floating-point arithmetic or anything like that between the two modes?

The part of the code that is probably “misbehaving” is this:

template<typename HitType>
struct PixelRenderFrame {
	Photon photon;
	typename ShadedOctree<HitType>::RaycastHit hit;
	ColorRGB color;
	ShaderBounce bounce;
	int bounceId;
};
template<typename HitType, unsigned int MaxStackSize>
struct PixelRenderProcess {
	SceneDataHandles<HitType> world;
	PixelRenderFrame<HitType> stack[MaxStackSize];
	PixelRenderFrame<HitType> *end;
	PixelRenderFrame<HitType> *ptr;
	Pixel::PixelColor output;
	bool renderComplete;

	bool shadeStarted;
	bool midShade;
	Photon savedPhoton;
	int lightId;

	int castsLeft;

	__device__ __host__ inline bool shade() {
		PixelRenderFrame<HitType> &frame = (*ptr);
		
		if (!shadeStarted) {
			if (castsLeft <= 0) {
				midShade = true;
				return true;
			}
			else {
				midShade = false;
				shadeStarted = true;
				castsLeft--;
			}
			if (frame.photon.dead()) return false;
			if (!world.world->cast(frame.photon.ray, frame.hit, false)) return false;

			frame.color(0.0f, 0.0f, 0.0f);

			// BOUNCE FROM INDIRECT ILLUMINATION:
			if ((ptr + 1) != end) {
				ShaderBounceInfo<HitType> bounceInfo = { frame.hit.object.object, frame.photon, frame.hit.hitPoint };
				frame.hit.object.material->bounce(bounceInfo, &frame.bounce);
			}
			else frame.bounce.count = 0;
			frame.bounceId = 0;
		}

		// COLOR FROM LIGHT SOURCES:
		const Stacktor<Light> &lights = (*world.lights);
		for (int i = lightId; i < lights.size(); i++) {
			bool noShadows;
			Photon p;
			if (midShade) {
				noShadows = false;
				p = savedPhoton;
			}
			else {
				noShadows = false;
				p = lights[i].getPhoton(frame.hit.hitPoint, &noShadows);
				if (p.dead()) continue;
			}
			if (!noShadows) {
				if (castsLeft > 0) {
					typename ShadedOctree<HitType>::RaycastHit lightHit;
					if (world.world->cast(p.ray, lightHit, false)) {
						if ((frame.hit.hitPoint - lightHit.hitPoint).sqrMagnitude() <= 128.0f * VECTOR_EPSILON)
							noShadows = true;
					}
					else noShadows = true;
					midShade = false;
					castsLeft--;
				}
				else {
					midShade = true;
					savedPhoton = p;
					lightId = i;
					return true;
				}
			}
			if (noShadows) {
				ShaderHitInfo<HitType> castInfo = { frame.hit.object.object, p, frame.hit.hitPoint, frame.photon.ray.origin };
				frame.color += frame.hit.object.material->illuminate(castInfo).color;
			}
		}
		frame.color *= frame.photon.color;

		shadeStarted = false;
		midShade = false;
		lightId = 0;
		return true;
	}

	__device__ __host__ inline bool iterate() {
		while (true) {
			if (midShade || (ptr->bounceId < ptr->bounce.count)) {
				if (!midShade) {
					Photon sample = ptr->bounce.samples[ptr->bounceId];
					sample.ray.origin += sample.ray.direction * (128.0f * VECTOR_EPSILON);
					ptr->bounceId++;
					ptr++;
					ptr->photon = sample;
				}
				if (!shade()) {
					if (ptr != stack) ptr--;
					else {
						output.depth = -1;
						renderComplete = true;
						return true;
					}
				}
				else if (midShade) return false;
			}
			else if(ptr == stack) {
				output.color = ptr->color;
				output.depth = ptr->hit.hitDistance;
				renderComplete = true;
				return true;
			}
			else {
				ColorRGB col = ptr->color;
				ptr--;
				ptr->color += col;
			}
		}
	}

	__device__ __host__ inline void setup(const Photon &photon, const SceneDataHandles<HitType> &world) {
		this->world = world;
		end = (stack + MaxStackSize);
		ptr = stack;
		ptr->photon = photon;
		shadeStarted = false;
		midShade = true;
		lightId = 0;
		renderComplete = false;
		castsLeft = 0;
	}
};

// Code below is temporary and will be changed in the future
template<typename HitType, unsigned int MaxStackSize>
__device__ __host__ inline static Pixel::PixelColor renderPixel(const Photon &photon, const SceneDataHandles<HitType> &world, PixelRenderProcess<HitType, MaxStackSize> *stack) {
	PixelRenderProcess<HitType, MaxStackSize> pixel;
	pixel.setup(photon, world);
	while (!pixel.renderComplete) {
		pixel.castsLeft = 1; // this variable will be used to limit raycasts per kernel in the future, to prevent crashes.
		pixel.iterate();
	}
	return pixel.output;
}

njuffa · December 29, 2016, 9:34am

There can be numerical differences between debug and release mode builds. For example, the contraction of floating-point multiplies and adds into FMA (fused multiply-add) instructions is an optimization. In debug mode all optimizations are turned off. Use of the contraction can often change numerical results since use of FMA reduces rounding errors and preserves bits in cases of subtractive cancellation, leading to more accurate results.
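As a rough illustration of the contraction effect, here is a minimal host-side sketch. It is not taken from the code above, and the helper names `squareMinusSeparate` and `squareMinusFused` are made up for this example. It evaluates a*a - c once with two roundings and once with a single rounding via `std::fma`:

```cpp
#include <cmath>

// Hypothetical helper: computes a*a - c with two roundings. The volatile
// store forces the product to be rounded to float before the subtraction,
// so the host compiler cannot contract the expression into an FMA.
float squareMinusSeparate(float a, float c) {
    volatile float product = a * a;  // first rounding happens here
    return product - c;              // second rounding happens here
}

// Hypothetical helper: same expression, but the product is kept exact and
// only the final result is rounded, as a fused multiply-add does.
float squareMinusFused(float a, float c) {
    return std::fma(a, a, -c);
}
```

With a = 1 + 2^-12 and c = 1 + 2^-11, the exact value of a*a - c is 2^-24. The separate version returns 0, because the product is first rounded to 1 + 2^-11, while the fused version returns 2^-24. In device code the corresponding explicit forms are the intrinsics `__fmul_rn` / `__fadd_rn` (never contracted) and `__fmaf_rn`, and contraction can be disabled for a whole build with the nvcc option `-fmad=false`.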

What you describe doesn’t sound like a consequence of numerical differences, though; it’s more consistent with race conditions in the code that get exposed by different code timing.

When you run your code under control of cuda-memcheck, are any issues reported? Note that cuda-memcheck cannot find all race conditions.

TheDonsky · December 29, 2016, 9:46am

Thanks for the information about FMA. I had some problems with floats not being accurate enough some time ago and didn’t know about optimizations like that.
I don’t think the problem has to do with race conditions, because every variable that is shared between threads remains intact during the whole process, and each thread has its own instance of the struct mentioned above on the stack and updates only one pixel (at least for now).
For comparison, you can see the image I attached to the post. The one on the left is the Debug mode output, and the one that shows the image properly is my Production build.

TheDonsky · December 29, 2016, 10:06am

I checked with cuda-memcheck and it didn’t detect any errors in the Release build, while the Debug build simply crashed after reaching the timeout (the limit of 5 or so seconds for kernels). Since my test renders over and over again until I close the window, changing the timeout won’t give me a chance to test anyway. We may assume it’s not a memory issue.

Update:
I changed the kernel size, and the Debug test didn’t crash this time when run under cuda-memcheck. No errors were detected, just as expected, and the image still looks terrible.
I’ll simply settle on the idea that it’s a numerical issue. The Octree<> class, which is responsible for raycasts, uses functions that allow a really small error (0.0.000005f) in most of the calculations, and the Debug mode’s lack of optimizations likely causes cast() to fail and produce random results. This still seems strange, though, considering that the test of the Octree itself, which renders some things without lights and/or reflections/refractions, works just fine in both modes.
