Layered depth of field based on Gaussian Blur

Hey,

Here’s my attempt at a real-time (or at least CUDA-accelerated) depth-of-field effect.

[attachment=21708:DOF.jpg]

Games usually do this in pixel shaders when real-time performance is needed. I decided to use CUDA and based this on the constant-time Gaussian blur implementation from the CUDA SDK (the recursiveGaussian sample). I create many differently blurred layers and alpha-blend them together in back-to-front order. Performance is just so-so (real time is achieved only on high-end cards), but I am aiming for quality more than for real-time speed. This project is extremely bandwidth-limited because of the many filter passes.
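To make the layering idea concrete, here is a minimal CPU-side sketch in C++ of the back-to-front blend. It is not the code from the attached project: the `Layer` struct, `layerSigma`, `compositeBackToFront`, the premultiplied-alpha convention, and the linear relation between blur width and distance from the focal plane are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical layer type: one already-blurred depth slice of the image.
struct Layer {
    std::vector<float> color;  // colour premultiplied by alpha, one value per pixel
    std::vector<float> alpha;  // coverage in [0, 1], one value per pixel
};

// Assumed blur model: the Gaussian sigma grows linearly with the distance
// of the layer from the plane of focus, scaled by the effect strength.
inline float layerSigma(float layerDepth, float focusDepth, float strength) {
    return strength * std::fabs(layerDepth - focusDepth);
}

// Blend the layers with the "over" operator. The vector must be ordered
// from the farthest layer to the nearest one (back to front).
inline std::vector<float> compositeBackToFront(const std::vector<Layer>& layers,
                                               std::size_t pixelCount) {
    std::vector<float> out(pixelCount, 0.0f);
    for (const Layer& layer : layers) {
        for (std::size_t i = 0; i < pixelCount; ++i) {
            // The nearer layer covers what has been accumulated so far.
            out[i] = layer.color[i] + (1.0f - layer.alpha[i]) * out[i];
        }
    }
    return out;
}
```

In this sketch each layer would first be blurred with `layerSigma(...)` for its depth (colour and alpha alike), then passed to `compositeBackToFront`. Blurring the alpha channel along with the colour is what lets out-of-focus foreground layers bleed softly over sharper layers behind them.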

My intention was to accelerate a depth-of-field postprocessor for the 3D fractal generator called “Mandelbulber”. But this effect is not yet good enough to compare favorably with the original CPU-based implementation. In particular, the Gaussian filter kernel does not look good enough in very blurred sections. It would be better to use a disc-shaped convolution kernel to mimic an “Airy disc” or “circle of confusion” (look these up on Wikipedia if needed). The graphics file I included is a scene rendered in Mandelbulber (one of the default scenes dumped to ASCII). To see more stunning Mandelbulber screenshots, look here: http://www.fractalforums.com/images-showca…-of-field-test/
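For reference, a disc-shaped kernel of the kind mentioned above could be built roughly like this. This is only a sketch and not part of the attached project: `DiscKernel`, `makeDiscKernel`, and the one-pixel linear falloff at the rim (used to soften the edge) are my own assumptions. Note that a disc kernel is not separable, so it cannot be dropped into the row/column recursive filter as-is.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for a square kernel of size (2*halfWidth+1)^2.
struct DiscKernel {
    int halfWidth;
    std::vector<float> weights;  // row-major, normalised so the weights sum to 1
};

// Build a flat disc of the given radius (in pixels, radius >= 0).
// Weights are 1 inside the disc, 0 outside, with a linear ramp across
// the last pixel at the rim to reduce aliasing.
inline DiscKernel makeDiscKernel(float radius) {
    DiscKernel k;
    k.halfWidth = static_cast<int>(std::ceil(radius));
    const int size = 2 * k.halfWidth + 1;
    k.weights.assign(static_cast<std::size_t>(size) * size, 0.0f);

    float sum = 0.0f;
    for (int y = -k.halfWidth; y <= k.halfWidth; ++y) {
        for (int x = -k.halfWidth; x <= k.halfWidth; ++x) {
            const float dist = std::sqrt(static_cast<float>(x * x + y * y));
            const float w = std::min(1.0f, std::max(0.0f, radius + 0.5f - dist));
            k.weights[static_cast<std::size_t>(y + k.halfWidth) * size + (x + k.halfWidth)] = w;
            sum += w;
        }
    }
    // The centre weight is always positive, so sum > 0 here.
    for (float& w : k.weights) {
        w /= sum;
    }
    return k;
}
```

Unlike a Gaussian, every sample inside the disc contributes equally, which is what gives bright out-of-focus points their flat, round look.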

The attached archive contains the Visual C++ project and a Win32 executable. It replaces the recursiveGaussian SDK project folder (so create a backup if you still need that one). To compile on Linux, add the Makefile from the original recursiveGaussian project.

NOTE: THE ATTACHED FILE IS ACTUALLY A 7-ZIP (.7z) ARCHIVE, NOT A .zip. YOU MUST USE THE 7-ZIP TOOL TO DECOMPRESS IT.

Most important keys:
Arrow Down/Up moves the plane of focus
Arrow Left/Right adjusts the effect strength
‘s’ toggles a sweep through the layers

Enjoy.

Edit: A speedup might be achieved by generating several blur layers in one pass over the source image (different threads working with different filter coefficients), and by using shared memory so that the second filter pass can filter the columns directly. That would save the two transpose passes in the filtering process (in other words, the transpose would be performed on the fly).

Apparently the forum software is broken and can’t serve large files properly. I only get 2.3 MB out of 6.5 MB when I try to download this attachment.

So I have put the archive on a file hosting service. The alternate download location is:
http://rapidshare.com/files/379768089/recu…aussian.7z.html

Hey,
I’m very interested in your project above, but the download link is no longer valid. Could you please send me a copy? My email is xw20060830123@163.com. I would greatly appreciate an early reply. Thank you in advance.
Best wishes
Jaze