Converting RGB to RGBA

e.ping · February 7, 2008, 6:30pm

I’m doing various image processing via CUDA, but the capture device outputs RGB. Because of memory alignment issues, it’s much faster for CUDA programs to deal with RGBA.

The naive way to convert from 24 to 32 bits is to just copy and pad each texel. The problem is that this causes badly uncoalesced reads from global memory. A power of two is never a multiple of 3, so I can’t think of a way around this. Can anyone think of a clever trick?
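
For reference, the naive copy-and-pad conversion described above might look like the sketch below. The kernel name `rgbToRgbaNaive`, the host helper `convertNaive`, the block size of 256, and the alpha value of 255 are all assumptions made for illustration; none of them come from the original post.

```cuda
#include <cuda_runtime.h>

// Naive conversion: one thread per texel (hypothetical kernel name).
// Each thread issues three single-byte loads at a 3-byte stride, so the
// loads of neighbouring threads do not line up as aligned 32-bit words.
__global__ void rgbToRgbaNaive(const unsigned char* rgb, uchar4* rgba, int numTexels)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numTexels) return;

    const unsigned char* p = rgb + 3 * (size_t)i;
    // The write is one aligned 4-byte store per thread.
    // Alpha = 255 is an assumption; the post does not say what to pad with.
    rgba[i] = make_uchar4(p[0], p[1], p[2], 255);
}

// Hypothetical host helper: copy in, launch, copy out.
// Error checking is omitted to keep the sketch short.
void convertNaive(const unsigned char* hostRgb, uchar4* hostRgba, int numTexels)
{
    unsigned char* dRgb = 0;
    uchar4* dRgba = 0;
    cudaMalloc(&dRgb, 3 * (size_t)numTexels);
    cudaMalloc(&dRgba, sizeof(uchar4) * (size_t)numTexels);
    cudaMemcpy(dRgb, hostRgb, 3 * (size_t)numTexels, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (numTexels + threads - 1) / threads;
    rgbToRgbaNaive<<<blocks, threads>>>(dRgb, dRgba, numTexels);

    cudaMemcpy(hostRgba, dRgba, sizeof(uchar4) * (size_t)numTexels, cudaMemcpyDeviceToHost);
    cudaFree(dRgb);
    cudaFree(dRgba);
}
```

The store side is already well behaved here; it is only the byte-wise, stride-3 read pattern that is the concern.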

Maybe I could have each thread read 4 texels’ worth of data (12 bytes) and write out 16 bytes, but skip the 5th texel read and leave a hole in the destination data.

So during the first pass, you always read bytes 1-12, 16-28, etc. This would guarantee coalesced reads from global memory (I think; I haven’t checked my math yet).

Then do another pass to deal with the remaining texel. The second pass would be uncoalesced, but since it would only process 1/4 of the image, it should overall be considerably faster.

mfatica · February 7, 2008, 6:36pm

You can use shared memory. Look at slide 19 of the optimization presentation:

http://www.gpgpu.org/sc2007/SC07_CUDA_5_Optimization_Harris.pdf
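
The shared-memory suggestion can be sketched roughly as follows. This is not code from the linked slides; it is a minimal illustration of the general idea of staging the input through shared memory with word-sized loads. The names `rgbToRgbaShared` and `convertShared`, the 256-texel tile, and the alpha value of 255 are assumptions, and the sketch assumes the input pointer is 4-byte aligned (which holds for memory returned by `cudaMalloc`).

```cuda
#include <cuda_runtime.h>

#define TEXELS_PER_BLOCK 256                        // assumed tile size, multiple of 4
#define WORDS_PER_BLOCK (TEXELS_PER_BLOCK * 3 / 4)  // 768 bytes = 192 32-bit words

// Hypothetical kernel: launch with blockDim.x == TEXELS_PER_BLOCK.
__global__ void rgbToRgbaShared(const unsigned char* rgb, uchar4* rgba, int numTexels)
{
    __shared__ unsigned int stage[WORDS_PER_BLOCK];
    unsigned char* stageBytes = reinterpret_cast<unsigned char*>(stage);

    const size_t totalBytes = 3 * (size_t)numTexels;
    const size_t blockByte0 = (size_t)blockIdx.x * TEXELS_PER_BLOCK * 3;

    // Step 1: the first 192 threads each load one aligned 32-bit word,
    // so consecutive threads touch consecutive words of global memory.
    if (threadIdx.x < WORDS_PER_BLOCK) {
        size_t byteOff = blockByte0 + 4 * (size_t)threadIdx.x;
        if (byteOff + 4 <= totalBytes) {
            stage[threadIdx.x] = *reinterpret_cast<const unsigned int*>(rgb + byteOff);
        } else {
            // Tail of the image: copy leftover bytes one at a time so we
            // never read past the end of the input buffer.
            for (int k = 0; k < 4 && byteOff + k < totalBytes; ++k)
                stageBytes[4 * threadIdx.x + k] = rgb[byteOff + k];
        }
    }
    __syncthreads();  // reached by every thread in the block

    // Step 2: each thread picks its 3 bytes out of shared memory and
    // writes one aligned 4-byte RGBA texel.
    int i = blockIdx.x * TEXELS_PER_BLOCK + threadIdx.x;
    if (i < numTexels) {
        const unsigned char* p = stageBytes + 3 * threadIdx.x;
        rgba[i] = make_uchar4(p[0], p[1], p[2], 255);  // alpha = 255 is an assumption
    }
}

// Hypothetical host helper; error checking omitted for brevity.
void convertShared(const unsigned char* hostRgb, uchar4* hostRgba, int numTexels)
{
    unsigned char* dRgb = 0;
    uchar4* dRgba = 0;
    cudaMalloc(&dRgb, 3 * (size_t)numTexels);
    cudaMalloc(&dRgba, sizeof(uchar4) * (size_t)numTexels);
    cudaMemcpy(dRgb, hostRgb, 3 * (size_t)numTexels, cudaMemcpyHostToDevice);

    int blocks = (numTexels + TEXELS_PER_BLOCK - 1) / TEXELS_PER_BLOCK;
    rgbToRgbaShared<<<blocks, TEXELS_PER_BLOCK>>>(dRgb, dRgba, numTexels);

    cudaMemcpy(hostRgba, dRgba, sizeof(uchar4) * (size_t)numTexels, cudaMemcpyDeviceToHost);
    cudaFree(dRgb);
    cudaFree(dRgba);
}
```

Because each tile is 768 bytes, every block starts on a word boundary, which is what lets the global reads be done as whole words. The stride-3 byte reads from shared memory are not tuned for bank conflicts here, so this should be treated as a starting point to profile rather than a finished kernel.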
