Hello all, my first post!

First, a quick rundown of what I'm doing:
I'm creating a real-time path tracer, and I'm currently using a large portion of the smallPT code by Kevin Beason (2008).
I'm also using OpenGL as the graphics API and OpenCL (obviously) for the compute performance.

I've ported the code from Linux to Windows and got it running inside an OpenGL window. It runs at 3 fps, but that's only if I set the window size to 40 x 30!

OK, my problem: I'm trying to figure out how to take a radiance function that works in host code and convert it to run solely on the GPU.
My reason for this is the billion or so calls to the radiance function that are required (depending on screen size and sample count).

Any input is much appreciated,

Many thanks

Rob

Note: ‘Vec’, ‘Ray’ and ‘Sphere’ are types defined in this program. Xi holds the seed for the random number generator.

```
Vec radiance(const Ray &r, int depth, unsigned short *Xi)
{
  double t;    // distance to intersection
  int id = 0;  // id of intersected object

  if (!intersect(r, t, id)) return Vec(255,255,255); // if miss, return the background colour
  const Sphere &obj = spheres[id];        // the hit object
  Vec x=r.o+r.d*t, n=(x-obj.p).norm(), nl=n.dot(r.d)<0?n:n*-1, f=obj.c;
  double p = f.x>f.y && f.x>f.z ? f.x : f.y>f.z ? f.y : f.z; // max refl

  if (++depth>5) {                        // Russian roulette
    if (erand48(Xi)<p) f=f*(1/p); else return obj.e;
  }

  if (obj.refl == DIFF)                   // Ideal DIFFUSE reflection
  {
    double r1=2*M_PI*erand48(Xi), r2=erand48(Xi), r2s=sqrt(r2);
    Vec w=nl, u=((fabs(w.x)>.1?Vec(0,1):Vec(1))%w).norm(), v=w%u;
    Vec d = (u*cos(r1)*r2s + v*sin(r1)*r2s + w*sqrt(1-r2)).norm();
    return obj.e + f.mult(radiance(Ray(x,d),depth,Xi)); // recursive call, as in smallpt
  }
  else if (obj.refl == SPEC)              // Ideal SPECULAR reflection
    return obj.e + f.mult(radiance(Ray(x,r.d-n*2*n.dot(r.d)),depth,Xi));

  Ray reflRay(x, r.d-n*2*n.dot(r.d));     // Ideal dielectric REFRACTION
  bool into = n.dot(nl)>0;                // Ray from outside going in?
  double nc=1, nt=1.5, nnt=into?nc/nt:nt/nc, ddn=r.d.dot(nl);

  double cos2t=1-nnt*nnt*(1-ddn*ddn);
  if (cos2t<0)                            // Total internal reflection
    return obj.e + f.mult(radiance(reflRay,depth,Xi));

  Vec tdir = (r.d*nnt - n*((into?1:-1)*(ddn*nnt+sqrt(cos2t)))).norm();
  double a=nt-nc, b=nt+nc, R0=a*a/(b*b), c = 1-(into?-ddn:tdir.dot(n));
  double Re=R0+(1-R0)*c*c*c*c*c,Tr=1-Re,P=.25+.5*Re,RP=Re/P,TP=Tr/(1-P);

  return obj.e + f.mult(depth>2 ? (erand48(Xi)<P ?   // Russian roulette
    radiance(reflRay,depth,Xi)*RP : radiance(Ray(x,tdir),depth,Xi)*TP) :
    radiance(reflRay,depth,Xi)*Re + radiance(Ray(x,tdir),depth,Xi)*Tr);
}
```

The first thing that jumps out is that radiance() calls radiance() in the last line. Recursion is not allowed in OpenCL. Sorry, but that might mean you need to trace outside of Kevin Beason’s lines if you wish to pull this off.

OK, cheers for the heads up. I also found that the types are defined as C++-style structs with member functions. If I remember right, OpenCL and CUDA both use C, so I've got to change the code there too.
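For what it's worth, here is a minimal sketch of what that change might look like. OpenCL C has no member functions, operator overloading or references, so smallpt's `Vec` would become plain data plus free functions. The names `Vec3` and `vec_*` are made up for this sketch, and it uses `float` on the assumption that double precision (which is optional in OpenCL) may not be available on the device.

```cpp
#include <cmath>

// C-style stand-in for smallpt's Vec: plain data, no member functions.
// Vec3 and every vec_* name are hypothetical, chosen for this sketch.
typedef struct { float x, y, z; } Vec3;

static Vec3 vec_make(float x, float y, float z) { Vec3 v = {x, y, z}; return v; }

static Vec3 vec_add(Vec3 a, Vec3 b) { return vec_make(a.x + b.x, a.y + b.y, a.z + b.z); }
static Vec3 vec_sub(Vec3 a, Vec3 b) { return vec_make(a.x - b.x, a.y - b.y, a.z - b.z); }
static Vec3 vec_scale(Vec3 a, float s) { return vec_make(a.x * s, a.y * s, a.z * s); }

// Component-wise product, the counterpart of Vec::mult.
static Vec3 vec_mult(Vec3 a, Vec3 b) { return vec_make(a.x * b.x, a.y * b.y, a.z * b.z); }

static float vec_dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Cross product, the counterpart of smallpt's operator%.
static Vec3 vec_cross(Vec3 a, Vec3 b) {
    return vec_make(a.y * b.z - a.z * b.y,
                    a.z * b.x - a.x * b.z,
                    a.x * b.y - a.y * b.x);
}

static Vec3 vec_norm(Vec3 a) { return vec_scale(a, 1.0f / std::sqrt(vec_dot(a, a))); }

// Mirror reflection of direction d about normal n,
// i.e. the expression r.d - n*2*n.dot(r.d) written with free functions.
static Vec3 vec_reflect(Vec3 d, Vec3 n) {
    return vec_sub(d, vec_scale(n, 2.0f * vec_dot(n, d)));
}
```

With helpers like these, an expression such as `x = r.o + r.d*t` turns into `vec_add(r.o, vec_scale(r.d, t))`, which is more verbose but has nothing in it that a C compiler would reject.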

Actually, in this case it’s quite easy to convert the recursion into a simple while loop. I did my own CUDA port of SmallPT a while ago, and this guy has also done an OpenCL port:

http://davibu.interfree.it/opencl/smallptgpu/smallptGPU.html
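To illustrate the idea, here is a rough sketch of one common way to unroll that kind of recursion. It is not taken from either port. Each call in smallpt has the shape "emission + reflectance * radiance(next ray)", so a loop can carry the product of the reflectances seen so far (often called throughput) and add each surface's emission weighted by it. The `Bounce` struct is a hypothetical stand-in for the intersect-and-sample step, and scalars are used instead of `Vec` to keep it short.

```cpp
#include <cstddef>
#include <vector>

// One bounce along a path: what the surface emits and how much it reflects.
// Hypothetical stand-in for a real intersect() plus material sampling step.
struct Bounce {
    double emission;
    double reflectance;
};

// Recursive form, shaped like smallpt: L = e + f * radiance(next bounce).
double radiance_recursive(const std::vector<Bounce>& path, std::size_t i = 0) {
    if (i >= path.size()) return 0.0;  // ray left the scene: no contribution
    return path[i].emission + path[i].reflectance * radiance_recursive(path, i + 1);
}

// Iterative form: no call stack, just two running values.
double radiance_iterative(const std::vector<Bounce>& path) {
    double accumulated = 0.0;  // radiance gathered so far
    double throughput = 1.0;   // product of reflectances along the path so far
    for (std::size_t i = 0; i < path.size(); ++i) {
        accumulated += throughput * path[i].emission;
        throughput *= path[i].reflectance;
    }
    return accumulated;
}
```

Both functions give the same result for the same path. One caveat: for depth <= 2 the refraction branch in smallpt traces both the reflected and the refracted ray, which is two recursive calls, so a loop version would have to pick one of them at random (as the deeper bounces already do) or keep an explicit stack.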

OK, unfortunately I didn't realise that a port had already been made. I think the next step, to add my own element to it, would be to continue with this port, using Kevin's original work and David's work on the OpenCL port, and add functionality to the path tracer by allowing OBJ import and intersection.

Rob, you may want to check http://www.luxrender.net/wiki/index.php?title=SLG

It is somewhat a continuation of the work I did on SmallPtGPU. It includes support for scene import from modellers (e.g. Blender, 3DS Max), materials, etc. However, SLG is a hybrid renderer (CPU+GPU), not GPU-only like SmallPtGPU.

A demo video of SLG is available here: http://vimeo.com/10974423