I’m doing the final essay of my CS degree on an algorithm that does compression on the video card (GPU). For that I chose CUDA.
I’m hoping I can do something nice by the end of the semester, although I have NO experience with CUDA and only weak C/C++ skills (I’m a Java developer).
I’m studying ways to do it. I’ve found pigz and I intend to use it as a guide for how I should do it.
If you have an idea of how this algorithm should be implemented, please reply.
I’d do it block-wise, with every multiprocessor crunching on an independent block of input data. Encryption with CUDA would have been way easier, if you ask me. In compression the data dependencies lead to lots of branching, which kills performance (divergent paths within a warp are bad). And CUDA isn’t so good with 32-bit integer operations (the fast hardware integer multiply is only 24 bits wide); it excels more at floating point. But if you consider lossy compression algorithms, CUDA may be better suited: those typically do some signal processing (FFT, DCT, wavelet, MDCT, you name it) before throwing data away.
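To make the block-wise idea a bit more concrete, here is a rough, untested sketch. Everything in it is made up for illustration (the 64 KB chunk size, the compress_chunk stub that just stores bytes verbatim instead of compressing); a real LZ/Huffman stage would go where the stub is, and it would spread the per-chunk work over all threads of the block instead of just thread 0.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

#define CHUNK_SIZE (64 * 1024)   // per-chunk size; an arbitrary choice for this sketch

// Placeholder "compressor": stores the chunk verbatim and reports its length.
// A real LZ/Huffman stage would go here.
__device__ int compress_chunk(const unsigned char* in, int n, unsigned char* out)
{
    for (int i = 0; i < n; ++i) out[i] = in[i];
    return n;                     // "compressed" length (== input length for the stub)
}

__global__ void compress_blockwise(const unsigned char* input, int total_len,
                                   unsigned char* output, int* out_lens)
{
    int chunk  = blockIdx.x;      // each block owns one independent chunk
    int offset = chunk * CHUNK_SIZE;
    if (offset >= total_len) return;

    int len = min(CHUNK_SIZE, total_len - offset);

    // Only thread 0 works in this sketch; a real kernel would parallelize
    // the per-chunk work across the whole block.
    if (threadIdx.x == 0)
        out_lens[chunk] = compress_chunk(input + offset, len,
                                         output + (size_t)chunk * CHUNK_SIZE);
}

int main()
{
    const int total_len = 1 << 20;                        // 1 MB of dummy input
    const int chunks    = (total_len + CHUNK_SIZE - 1) / CHUNK_SIZE;

    unsigned char *d_in, *d_out; int *d_lens;
    cudaMalloc(&d_in,  total_len);
    cudaMalloc(&d_out, (size_t)chunks * CHUNK_SIZE);
    cudaMalloc(&d_lens, chunks * sizeof(int));
    cudaMemset(d_in, 'A', total_len);

    compress_blockwise<<<chunks, 32>>>(d_in, total_len, d_out, d_lens);
    cudaDeviceSynchronize();
    printf("compressed %d chunks\n", chunks);

    cudaFree(d_in); cudaFree(d_out); cudaFree(d_lens);
    return 0;
}
```

The nice property is that the chunks have no data dependencies between them, similar in spirit to how pigz splits its input across worker threads, so the kernel scales with the number of multiprocessors.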
I would really like to see some more research into this area. Lossless compression seems to be one of the few areas in which CUDA falls flat on its face. Is anyone aware of any work on compression/decompression algorithms that can run in parallel? (I’m pretty sure the common gzip/bzip2/7z ones are incredibly hard to parallelize.) If nothing like that exists, it might make sense to go back to the drawing board and start designing new algorithms…
The idea is to make a lossless algorithm. I think it might be possible to do the compression in about the same amount of time as on the CPU. I’m not sure if I’ll be able to achieve that, but even if I can’t, having software that doesn’t use the CPU while you’re compressing/decompressing is always nice to have.
cbuchner1, thanks for the tips.
CapJo, I’ll be reading the links you sent me. Thanks!
Here is another link on the same topic. IBM already had the idea of doing compression in parallel, and they are also using a Lempel-Ziv algorithm (like in the other links I posted).
Patent issues aside, arithmetic coding based on floats could be a worthwhile topic for investigation. Traditionally, CPU-based implementations use fixed-point integer arithmetic (usually on 16-bit ints), but on the GPU floats might perform better. The main challenge would be to emit the coded bits in a way that does not cause too many divergent branches among threads (and to use shared memory heavily, storing the bits until you have gathered enough to write them out to global memory in a coalesced way).
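Something along these lines (untested, and every name and size in it is invented for the sketch) is what I have in mind for the bit emission: each thread packs its bits into a register, parks full 32-bit words in a shared-memory staging buffer with an interleaved layout, and the block flushes that buffer to global memory with coalesced stores.

```cpp
#define THREADS_PER_BLOCK 128
#define WORDS_PER_THREAD  8              // staging capacity per thread (32 bits each)

struct BitWriter {
    unsigned int word;                   // bits currently being accumulated
    int nbits;                           // how many bits are in 'word'
    int nwords;                          // full words already parked in shared memory
};

// Append one coded bit; when 32 bits have accumulated, park the word in the
// staging buffer. The interleaved layout (word w of thread t at w*T + t)
// makes the later flush coalesce.
__device__ void emit_bit(BitWriter& bw, unsigned int* stage, int bit)
{
    bw.word = (bw.word << 1) | (bit & 1);
    if (++bw.nbits == 32) {
        stage[bw.nwords * THREADS_PER_BLOCK + threadIdx.x] = bw.word;
        bw.nbits = 0;
        ++bw.nwords;                     // a real coder flushes when this reaches WORDS_PER_THREAD
    }
}

__global__ void demo_bit_staging(unsigned int* global_out)
{
    __shared__ unsigned int stage[THREADS_PER_BLOCK * WORDS_PER_THREAD];

    // Clear this thread's staging slots so the flush never writes garbage.
    for (int w = 0; w < WORDS_PER_THREAD; ++w)
        stage[w * THREADS_PER_BLOCK + threadIdx.x] = 0;

    BitWriter bw = {0, 0, 0};
    // Pretend the arithmetic coder produced 64 bits for this thread.
    for (int i = 0; i < 64; ++i)
        emit_bit(bw, stage, (threadIdx.x >> (i & 7)) & 1);

    __syncthreads();                     // all words parked before the block-wide flush

    // Consecutive threads write consecutive addresses -> coalesced stores.
    for (int w = 0; w < WORDS_PER_THREAD; ++w)
        global_out[blockIdx.x * THREADS_PER_BLOCK * WORDS_PER_THREAD
                   + w * THREADS_PER_BLOCK + threadIdx.x] =
            stage[w * THREADS_PER_BLOCK + threadIdx.x];
}
```

The divergence cost is then confined to the emit_bit branch, while the expensive part (the global memory traffic) stays coalesced.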
There are ways to force particular rounding modes in CUDA; the compiler provides intrinsics for it (e.g. __fadd_rd and __fmul_rd round toward negative infinity), not just raw PTX. By choosing a suitable mode the behavior is fully defined.
Renormalization of the interval at known times can bring the numbers back into suitable ranges (we definitely need to stay away from denormals).
When compression and decompression both happen on CUDA devices, no problems should occur. If the CPU will do the decompression, then we need to make sure the rounding on the CPU behaves exactly the same as on the CUDA device.
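To illustrate the rounding-mode point, the interval update of a binary coder could be pinned to round-toward-negative-infinity with the __fmul_rd / __fadd_rd / __fsub_rd intrinsics. The coder state and the probability model below are placeholders, and the bit output plus carry/underflow handling are left out, so treat it only as a sketch of the idea.

```cpp
// Placeholder coder state; the probability model and bit output are omitted.
struct Interval {
    float low;      // lower end of the current coding interval
    float range;    // width of the current coding interval
};

// Encode one binary symbol, where p0 is the model's probability of a 0.
// Every floating-point step rounds toward -infinity, so a decoder using the
// same intrinsics (or a CPU pinned to the same rounding mode) reproduces the
// identical sequence of intervals.
__device__ void encode_bit_rd(Interval& iv, float p0, int symbol)
{
    float split = __fmul_rd(iv.range, p0);   // deterministic split point

    if (symbol == 0) {
        iv.range = split;
    } else {
        iv.low   = __fadd_rd(iv.low, split);
        iv.range = __fsub_rd(iv.range, split);
    }

    // Renormalize while the interval is small, keeping 'range' well away from
    // denormals. A real coder would emit the leading bit of 'low' here and
    // handle carries/underflow; that part is omitted from this sketch.
    while (iv.range < 1.0f / 65536.0f) {
        if (iv.low >= 0.5f) iv.low -= 0.5f;  // drop the bit that would be emitted
        iv.low   *= 2.0f;                    // doubling is exact in binary floats
        iv.range *= 2.0f;
    }
}
```

On the CPU side, fesetround(FE_DOWNWARD) from <fenv.h> would be the matching setting, provided the compiler is not allowed to reorder or contract the float operations.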
It’s been a long time since I last played with arithmetic coding myself.