Data packing for image processing

Is it worth storing the image data (luminance only, one byte per pixel) packed as 4 pixels per unsigned int?
The data would then be unpacked with bit operations when it is used. Is the gain from fewer texture accesses enough to outweigh the computation overhead introduced by the bit operations?

The algorithm requires about 16 or more consecutive pixels (along the x axis) at once, and the starting pixel address is not a multiple of 4.
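
To make the question concrete, here is a rough sketch in plain C (host-side, not actual GPU code) of the unpacking I have in mind. The byte order (pixel 0 in the least significant byte) and the function names `unpack_pixel` and `unpack_run` are just assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: pixel i is stored in word i / 4, in byte i % 4,
   with the lowest-numbered pixel in the least significant byte. */
static inline uint8_t unpack_pixel(const uint32_t *packed, size_t i)
{
    return (uint8_t)(packed[i >> 2] >> ((i & 3u) * 8u));
}

/* Copy `count` consecutive pixels starting at an arbitrary (possibly
   unaligned) pixel index `start`. Each packed word is fetched only once. */
static void unpack_run(const uint32_t *packed, size_t start, size_t count,
                       uint8_t *out)
{
    assert(packed != NULL && out != NULL);

    size_t word = start >> 2;                      /* index of the first word */
    unsigned shift = (unsigned)(start & 3u) * 8u;  /* bit offset inside it    */
    uint32_t cur = packed[word];

    for (size_t k = 0; k < count; ++k) {
        out[k] = (uint8_t)(cur >> shift);          /* extract one byte */
        shift += 8u;
        if (shift == 32u && k + 1 < count) {       /* word exhausted: fetch next */
            cur = packed[++word];
            shift = 0u;
        }
    }
}
```

With this layout, 16 pixels starting at an unaligned address touch 5 words instead of 16 separate byte fetches, at the cost of one shift and one truncation per pixel.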