With a GTX 260 on XP x64 I get 235 FPS.
Could it be that there are lots of host-to-device memcpys? That’s the only thing I can think of that would cause such a difference… (Neither of my PCIe buses is that fast: old workstation hardware, 1.5 GB/s.)
I am doing a lot of Mapping/Unmapping of OpenGL Pixel Buffer objects, which is known to have performance issues. My mainboard only has PCIE 4x support by the way so I am not the fastest one either.
UPDATE: to curb this bottleneck, I am now only mapping the PBO when the texture is going to be rendered to screen (i.e. every 10th frame). In the other frames I am not writing an output texture in my CUDA kernel, but I am still generating an error metric that could be used to select good mutations. I am now getting around 680 FPS.
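The frame-skipping strategy described above can be sketched in plain Python (the interval constant and helper name are illustrative, not taken from the original code):

```python
DISPLAY_INTERVAL = 10  # map the PBO only on every 10th frame

def should_display(frame):
    """True when this frame should pay the expensive PBO map/unmap cost.

    On all other frames the CUDA kernel skips the output texture but
    still computes the error metric used to select good mutations.
    """
    return frame % DISPLAY_INTERVAL == 0

# Over 100 frames, only 10 hit the costly display path.
displayed = sum(1 for f in range(100) if should_display(f))
```

The evolution loop keeps running at full speed; only the visualization is throttled.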
Please let me know if performance is now significantly better for you.
Christian
I updated the code in the parent post to have an OpenGL based render path as well.
Use the ‘r’ key to switch renderers. Use ‘m’ to stop mutation in order to compare the output of both renderers. They are not pixel-exact (yet), but that hasn’t been the design goal.
OpenGL rendering seems to be about 3 times as fast as the CUDA counterpart initially, but the frame rate drops as the mutant polygons grow bigger; we seem to hit fill rate limitations soon. It has been fun to figure out how to do this properly in OpenGL. Fortunately, disabling all RGB color range clamping was possible with the ARB_color_buffer_float extension. Before I found out about this, I had to do ping-pong rendering between two floating point buffers with fragment shaders, which was horribly slow.
When enabling the Box Filter on the OpenGL render path, I have to copy the rendered contents of the 32 bit floating point RGBA frame buffer object into a pixel buffer object that CUDA can access. This conversion seems slow, and the frame rate drops to about 400 FPS for me (even slower than the CUDA renderer with filtering).
The main drawback of the OpenGL renderer is that it doesn’t give me an error metric for the mutation yet. How do you do a reduction in a pixel shader? I don’t know. And transferring the image to CUDA for further analysis hits the performance bottleneck stated above.
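For readers unfamiliar with the reduction being discussed: turning per-pixel errors into a single fitness value uses a pairwise (tree) reduction, the same scheme a CUDA reduction kernel or a ping-pong shader pass implements. A minimal pure-Python sketch:

```python
def reduce_sum(values):
    """Pairwise (tree) reduction: each pass halves the array,
    mirroring what a GPU reduction kernel does within a block."""
    data = list(values)
    while len(data) > 1:
        if len(data) % 2:              # pad odd lengths with the identity
            data.append(0.0)
        data = [data[i] + data[i + 1] for i in range(0, len(data), 2)]
    return data[0]

# Mean-square-error fitness over one row of rendered vs. target pixels
rendered = [0.1, 0.5, 0.9, 0.3]
target   = [0.0, 0.5, 1.0, 0.5]
sq_err = [(r - t) ** 2 for r, t in zip(rendered, target)]
mse = reduce_sum(sq_err) / len(sq_err)
```

On the GPU each pass is one kernel launch (or one fullscreen quad into a half-size render target); the halving is what keeps the total work at O(n).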
Christian
Here is a variation of this technique that uses transparent circular “blobs” rather than polygons. The initial goal was to emulate an “oil painting” style.
[url=“http://www.m3xbox.com/index.php?page=p_gpupainting”]http://www.m3xbox.com/index.php?page=p_gpupainting[/url]
Just follow the link. This implementation uses GPGPU techniques (not CUDA). The site has screenshots.
Christian
I was thinking about this contest today, and had another idea you might possibly add as a ‘judging factor’:
The original fitness function did a pixel-by-pixel comparison with the generated polygon image; however, this does not account for one of the chief benefits of vector drawing – infinite scaling without loss of resolution. You could add this test to the fitness function by doing some sort of interpolation/scaling on the original image (say, to increase it to 200% size), then multiply your polygons by a scaling vector/matrix to increase their size as well. Then, do the same pixel by pixel comparison. With a few iterations of this on some given scales (for the contest, e.g. 33%, 50%, 125%, 200%, 400%), you could determine a sort of ‘error derivative’ – a function that could tell how good/bad your code matches the scaled picture. Lower values = better.
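A minimal sketch of that multi-scale fitness idea in Python. Note the shortcuts: images are plain 2D lists of grayscale values, scaling is nearest-neighbour, and the candidate raster is simply rescaled; in the real test you would re-render the polygons at each scale, which is the whole point of vector drawing:

```python
def scale_image(img, factor):
    """Nearest-neighbour scaling of a 2D list of grayscale pixels."""
    h, w = len(img), len(img[0])
    nh, nw = max(1, round(h * factor)), max(1, round(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(nw)] for y in range(nh)]

def pixel_error(a, b):
    """Sum of squared per-pixel differences (images must match in size)."""
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def multiscale_fitness(target, candidate, scales=(0.33, 0.5, 1.0, 2.0)):
    """Average the normalized per-pixel error over several scales.

    Lower is better; a candidate that only matches at 100% scale is
    penalized at the other scales.
    """
    total = 0.0
    for s in scales:
        t = scale_image(target, s)
        c = scale_image(candidate, s)
        total += pixel_error(t, c) / (len(t) * len(t[0]))
    return total / len(scales)
```

The per-scale errors could also be kept separate to estimate the “error derivative” across scales that the post describes.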
Another way to do this without upscaling is to take some large (multi-megapixel) pictures, where the original represents the largest size, then scale them down with Photoshop/GIMP/etc. into the corresponding sizes… this avoids the weird results that might occur from upscaling a very small picture.
More results on “Screaming Duck’s” Blog. He added a blur factor to the polygon and also created a binary format to investigate the kind of compression he could achieve. No use of CUDA though. This is quite an in-depth article tracing his steps.
Some Stuff - Screaming Duck Software
Not much going on here… Could Mr. Alsing PLEASE publish his source code, or at least a Windows .exe binary? I am a student who may be doing an investigation on evolutionary algorithms soon.
Google Code Archive - Long-term storage for Google Code Project Hosting.
This, however, is a quite old release (it takes several hours to achieve a good image). Roger Alsing’s blog reports on a multiprocessor release that spreads the workload across two CPUs and achieves the same quality in about 5 minutes. I think that’s what the previous poster wanted access to. Hint: the right person to ask would be Roger Alsing, not this forum ;)
I added all of Dan Bystrom’s (http://danbystrom.se) speed suggestions, then I improved it further. http://starcalc.110mb.com/EvoLisa.zip is where you can find it.
[url=“http://digg.com/d1riHp”]http://digg.com/d1riHp[/url]
Here’s an example of someone using a related technique to encode an image into 140 characters allowed by Twitter messages.
It’s just that the result looks anything but convincing yet ;)
I have attached a port of Roger’s program to Direct3D. It achieves 4,000 Generations per Second with 50 5-sided polygons on a GeForce 8600 using the default Mona Lisa image. It is most likely CPU-limited on the same GPU, so 10K+/sec may very well be possible with a modest Quad-Core system (I tested on a single core of such a system). I haven’t implemented the Fitness Function yet, but it will be done soon (in HLSL using Nvidia’s FX Composer). Right now, it simply mutates x random polygons, where x is the number you specify. Setting the “Image Scale” above 1.0 results in slight cosmetic issues on some computers (it pixelates when it is supposed to blur).
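For readers joining the thread here: the accept/reject loop all these ports share (mutate a few polygons, re-render, keep the change only if the error drops) can be sketched in Python with toy stand-ins; the genome below is just a list of numbers and “rendering” is the identity, so the structure, not the graphics, is what’s shown:

```python
import random

def evolve(target, genome, render, fitness, mutate,
           generations=2000, seed=0):
    """Hill-climbing core of EvoLisa-style programs:
    mutate, re-render, and keep the mutation only if fitness improves."""
    rng = random.Random(seed)
    best_err = fitness(render(genome), target)
    for _ in range(generations):
        candidate = mutate(genome, rng)
        err = fitness(render(candidate), target)
        if err < best_err:               # accept only improvements
            genome, best_err = candidate, err
    return genome, best_err

def mutate(g, rng):
    """Perturb one randomly chosen gene by a small uniform step."""
    i = rng.randrange(len(g))
    out = list(g)
    out[i] += rng.uniform(-0.5, 0.5)
    return out

target = [3.0, -1.0, 2.0]
render = lambda g: g                     # stand-in for polygon rasterization
fitness = lambda img, tgt: sum((a - b) ** 2 for a, b in zip(img, tgt))

best_genome, best_err = evolve(target, [0.0, 0.0, 0.0],
                               render, fitness, mutate)
```

In the GPU ports, `render` and `fitness` are where the work goes: rasterization on the device and a parallel reduction for the error sum.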
Hmm, that beats my 600 iterations/second for 127 triangles in CUDA (although I generate the fitness value during rendering).
Just curious, how can you perform a parallel reduction in HLSL? Or are you intending to just generate the difference image (mean square error metric) per pixel and let the CPU generate the final sum?
Christian
Patently obvious? How did you come to this wild conclusion?
Genetic algorithms are simply one class of algorithms for minimizing residual error, one of MANY such classes. In fact, GAs are among the most unfriendly to program and often quite inefficient compared to other methods for the majority of problems, and the theory of how best to choose mutation operators is very underdeveloped, leading to a great deal of uncertainty about the best way to use them.
So I’m quite curious how you came to the conclusion that GAs are better than other algorithms that can perform the same task, such as simulated annealing, belief propagation, graph cuts, Monte Carlo sampling, Levenberg-Marquardt, branch and bound, mean field annealing, particle swarm optimization, evolutionary search, etc.
IT IS FINISHED!!! (Well, at least the alpha version is; there is still no file format for it.) It gets around 1,200 FPS on a GeForce 8600 GT. If there is any way to get rid of all the state changes at every stage of the reduction (according to PIX, it even re-initializes samplers that aren’t used in that section of the code), that would be appreciated. I have attached the source (there is a binary in the bin/Release folder). It is released under the GNU General Public License, meaning that any application using any part of this source code must publish its source code as well.
A nice video of the “Evolisa” algorithm in action is found here:
[url=“http://www.brianlow.com/index.php/2009/01/26/evolisa-video/”]http://www.brianlow.com/index.php/2009/01/26/evolisa-video/[/url]
And here is the Sydney opera house rendered in SVG, as polygons generated by this algorithm. Worked fine in Firefox. [url=“http://www.conceptdevelopment.net/Wpf/EvoLisaViewer/operahouse_day.svg.xml”]http://www.conceptdevelopment.net/Wpf/EvoL...use_day.svg.xml[/url]
Now can we get this thread to over 30000 views or what.
Christian
Student project:
Evolisa done in PyCuda
http://cs264.org/projects/web/Ding_Yiyang/ding-robb/index.html
There’s a big thread going on at StackOverflow about this algorithm: unicode - Twitter image encoding challenge - Stack Overflow