Hi guys!
Some months ago I started working on a fractal raytracer in C++/OpenCL (https://synthverse.wordpress.com/). I have already run into a lot of bugs in the NVIDIA OpenCL compiler (access violations when declaring variables without using them), but I was always able to find a workaround.
This time it looks a bit more serious. In my raytracer I have to select the color of the nearest shape. I implemented this using a switch, but I always get a CL_OUT_OF_RESOURCES error when reading the output buffer (or CL_INVALID_COMMAND_QUEUE if I call clFinish() first). This happens only on NVIDIA GPUs; the same code works correctly on an AMD GPU and an Intel CPU. This is the important part of the code:
TracerOut Trace(CameraOut in, SceneParams params)
{
    float dist[2];
    TracerOut out;
    Mandelbulb1_OO Mandelbulb1_oo = Mandelbulb1_Object(in, params);
    dist[0] = distance(Mandelbulb1_oo.intersection, in.origin);
    Mandelbulb2_OO Mandelbulb2_oo = Mandelbulb2_Object(in, params);
    dist[1] = distance(Mandelbulb2_oo.intersection, in.origin);
    uint nearestId = 0;
    float nearestDist = 10000000.0f;
    for (uint i = 0; i < 2; i++)
    {
        if (dist[i] < nearestDist)
        {
            nearestDist = dist[i];
            nearestId = i;
        }
    }
    // Trick needed to avoid access violation bug
    Mandelbulb1_SO Mandelbulb1_so;
    Mandelbulb1_so.color.x = 0.0f;
    Mandelbulb2_SO Mandelbulb2_so;
    Mandelbulb2_so.color.x = 0.0f;
    switch (nearestId)
    {
        case 0:
            Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
            out.color = Mandelbulb1_so.color;
            break;
        case 1:
            Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
            out.color = Mandelbulb2_so.color;
            break;
        default:
            out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
            break;
    }
    return out;
}
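To rule out a plain logic error in the nearest-shape selection, that step can be checked on the host in isolation. The following is only a minimal C++ sketch, assuming the same starting threshold as the loop above; NearestIndex is a hypothetical helper name and is not part of the raytracer:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helper mirroring the kernel's selection loop.
// Returns the index of the smallest distance, or 0 when no distance is
// below the initial threshold (same defaults as nearestId / nearestDist).
std::size_t NearestIndex(const std::vector<float>& dist)
{
    assert(!dist.empty());            // the kernel always has at least one shape
    std::size_t nearestId = 0;
    float nearestDist = 10000000.0f;  // same threshold as in the kernel
    for (std::size_t i = 0; i < dist.size(); ++i)
    {
        if (dist[i] < nearestDist)
        {
            nearestDist = dist[i];
            nearestId = i;
        }
    }
    return nearestId;
}
```

With two shapes the result can only be 0 or 1, so the default case of the switch should never be reached.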
I suspected that the switch construct might be causing the problem, so I tried plain if statements instead:
TracerOut Trace(CameraOut in, SceneParams params)
{
    //...
    Mandelbulb1_SO Mandelbulb1_so;
    Mandelbulb1_so.color.x = 0.0f;
    Mandelbulb2_SO Mandelbulb2_so;
    Mandelbulb2_so.color.x = 0.0f;
    out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
    Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
    if (nearestId == 0)
        out.color = Mandelbulb1_so.color;
    if (nearestId == 1)
        out.color = Mandelbulb2_so.color;
    return out;
}
But I still have the same problem. Removing one or both of the if statements solves the problem:
TracerOut Trace(CameraOut in, SceneParams params)
{
    //...
    out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
    Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
    out.color = Mandelbulb1_so.color;
    if (nearestId == 1)
        out.color = Mandelbulb2_so.color;
    return out;
}
But of course this is not what I want. I know that conditionals are very bad for GPUs, but at the moment I don’t have another solution; optimization will come later. This code should work, so I believe this is a bug in the NVIDIA OpenCL driver, right? Has anyone had a similar problem? Is a fix coming?
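For reference, once both shader colors have been computed, the choice between them can in principle be written without any data-dependent branch. The following is only a host-side C++ sketch of that idea, not code from the raytracer: Float4 is a hypothetical stand-in for OpenCL's float4, SelectColor is a made-up name, and it is unknown whether such a formulation avoids the NVIDIA error.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for OpenCL's float4 on the host.
using Float4 = std::array<float, 4>;

// Picks c0 when nearestId == 0 and c1 when nearestId == 1 by weighting
// each color with 0.0f or 1.0f instead of branching on nearestId.
Float4 SelectColor(const Float4& c0, const Float4& c1, unsigned nearestId)
{
    assert(nearestId < 2);                           // only two shapes here
    const float w1 = static_cast<float>(nearestId);  // 0.0f or 1.0f
    const float w0 = 1.0f - w1;                      // the complementary weight
    Float4 out{};
    for (std::size_t k = 0; k < 4; ++k)
        out[k] = w0 * c0[k] + w1 * c1[k];
    return out;
}
```

One caveat with this weighting approach: if the color that is not selected contains NaN or infinity, it still contaminates the result, because 0.0f times NaN or infinity is NaN.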
Thank you,
Mattia.
EDIT: My PC specs: Intel 2600K / NVIDIA GTX 680 (driver 320.18) / Windows 8 64-bit
P.S. There is no section dedicated to OpenCL, so I hope it is covered by the CUDA section…