UPDATED: I replaced the texture bind/unbind code with a call to cudaThreadSynchronize(). I also removed the unneeded emulated double precision math functions from the CPU version of the code and converted it to use native double precision instead.
UPDATED: There was a bug in the high precision multiply. Either the G80 has a precision issue in the multiply-add instruction or I did not translate the original Fortran code correctly. I used an alternate version of the dsmul code to solve the problem. You can now zoom in much further than before.
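For anyone curious what a double-single multiply looks like, here is a rough sketch in plain C. It is not the code from the attached project, and it is not necessarily the alternate dsmul version mentioned above. It uses the well-known Dekker splitting approach, and all of the names (dsfloat, ds_split, ds_two_prod, ds_mul and so on) are made up for illustration. It assumes float expressions are evaluated in strict single precision (no x87 extended precision, no fast-math flags).

```c
/* A double-single value: the number represented is hi + lo,
   where lo holds the bits that did not fit into hi. */
typedef struct { float hi, lo; } dsfloat;

/* Split a float into two halves with about 12 significant bits each,
   so that products of the halves are exact in single precision. */
static void ds_split(float a, float *hi, float *lo) {
    const float splitter = 4097.0f;   /* 2^12 + 1 */
    float t = splitter * a;
    *hi = t - (t - a);
    *lo = a - *hi;
}

/* Product of two floats as a hi/lo pair (Dekker two-product):
   hi is the rounded product, lo is the rounding error of that product. */
static dsfloat ds_two_prod(float a, float b) {
    float ahi, alo, bhi, blo;
    dsfloat r;
    ds_split(a, &ahi, &alo);
    ds_split(b, &bhi, &blo);
    r.hi = a * b;
    r.lo = ((ahi * bhi - r.hi) + ahi * blo + alo * bhi) + alo * blo;
    return r;
}

/* Renormalize a pair into hi/lo form; requires |a| >= |b|. */
static dsfloat ds_quick_two_sum(float a, float b) {
    dsfloat r;
    r.hi = a + b;
    r.lo = b - (r.hi - a);
    return r;
}

/* Double-single multiply. The tiny a.lo * b.lo term is dropped. */
static dsfloat ds_mul(dsfloat a, dsfloat b) {
    dsfloat p = ds_two_prod(a.hi, b.hi);
    p.lo += a.hi * b.lo + a.lo * b.hi;
    return ds_quick_two_sum(p.hi, p.lo);
}

/* Helpers for moving values in and out (handy for checking on a CPU). */
static dsfloat ds_from_double(double x) {
    dsfloat r;
    r.hi = (float)x;
    r.lo = (float)(x - (double)r.hi);
    return r;
}

static double ds_to_double(dsfloat a) {
    return (double)a.hi + (double)a.lo;
}
```

The correction term in ds_two_prod only works if each intermediate product is rounded the way the algorithm expects, so a multiply-add that handles the intermediate product differently could plausibly upset it. I have not verified that this is what happened on the G80.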
UPDATED: I implemented double precision math functions to increase the maximum useful zoom factor. You will notice a big drop in the frame rate when the double precision math kicks in.
UPDATED: I unrolled the Mandelbrot loop and got a 17% speed increase.
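To illustrate what unrolling the loop means, here is a minimal sketch in plain C, not the kernel from the attached project. The unroll factor of 4 and the names mandel_step, mandel_simple and mandel_unrolled are my own choices for the example. The unrolled version checks the loop counter once per four steps instead of once per step and returns the same iteration count as the simple version. It makes no claim about reproducing the 17% figure, which was measured on the GPU.

```c
/* One Mandelbrot step for z = x + iy and c = cx + icy.
   Returns 0 if z has already escaped (|z|^2 > 4), otherwise updates z and returns 1. */
static int mandel_step(float *x, float *y, float cx, float cy) {
    float xx = *x * *x;
    float yy = *y * *y;
    if (xx + yy > 4.0f)
        return 0;
    *y = 2.0f * *x * *y + cy;
    *x = xx - yy + cx;
    return 1;
}

/* Reference version: one step and one counter check per loop pass. */
static int mandel_simple(float cx, float cy, int max_iter) {
    float x = 0.0f, y = 0.0f;
    int i = 0;
    while (i < max_iter && mandel_step(&x, &y, cx, cy))
        ++i;
    return i;
}

/* Unrolled by 4: the counter is tested and updated once per four steps. */
static int mandel_unrolled(float cx, float cy, int max_iter) {
    float x = 0.0f, y = 0.0f;
    int i = 0;
    while (i + 4 <= max_iter) {
        if (!mandel_step(&x, &y, cx, cy)) return i;
        if (!mandel_step(&x, &y, cx, cy)) return i + 1;
        if (!mandel_step(&x, &y, cx, cy)) return i + 2;
        if (!mandel_step(&x, &y, cx, cy)) return i + 3;
        i += 4;
    }
    /* Finish the remaining steps when max_iter is not a multiple of 4. */
    while (i < max_iter && mandel_step(&x, &y, cx, cy))
        ++i;
    return i;
}
```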
Hi all. I am just starting CUDA programming and wrote a simple Mandelbrot program as part of my learning process. I am submitting the project files for anyone to play around with. I did a timing test and found that it runs 85.5 times faster on an 8800 GTX GPU than on a 2 GHz AMD Opteron processor.
I have attached the project files. To build them, unzip the folder into the CUDA SDK projects folder. The program renders the Mandelbrot set at about 60 frames per second and uses adaptive sampling to anti-alias the image. When not animated, it will perform 128 passes of full frame anti-aliasing. You can randomize the color palette with the ‘c’ or ‘C’ keys and animate the colors with the ‘a’ or ‘A’ keys. Scrolling and zooming are performed by dragging with the left and right mouse buttons respectively. You can also use the ‘d’ and ‘D’ keys to increase or decrease detail.
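As a rough idea of how multi-pass anti-aliasing can be accumulated, here is a small sketch in plain C. It is an assumption about the general technique, not the code from the attached project: each pass is assumed to render the frame at a slightly different sub-pixel offset, and the result is folded into a running per-pixel average. The function name accumulate_pass and the one-float-per-pixel layout are hypothetical.

```c
/* Fold the n-th rendered pass (n counts from 1) into a running average.
   avg and pass each hold num_pixels values; after the call, avg holds
   the mean of passes 1..n. */
static void accumulate_pass(float *avg, const float *pass, int num_pixels, int n) {
    for (int p = 0; p < num_pixels; ++p)
        avg[p] += (pass[p] - avg[p]) / (float)n;
}
```

Calling this once per pass with n = 1, 2, ..., 128 keeps the displayed image equal to the average of all passes so far, so the picture sharpens gradually while the view is not moving.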