Downsampling an image (using the texture processor?)

I’d like to downsample an image by a factor of 2 (in each direction) using CUDA. What is the fastest way to do this? Can the texture processor help?

On the CPU, I would blur the image with a 5x5 kernel and then keep every second pixel in each direction.
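
To make the CPU approach concrete, here is a minimal sketch of what I have in mind. Several details are assumptions for illustration only: the function name `downsample2x` is hypothetical, the image is single-channel float in row-major order, the 5x5 kernel is the binomial one (outer product of 1 4 6 4 1, normalized by 256), and borders are handled by clamping to the edge. The blur is evaluated only at the pixels that are kept, which gives the same result as blurring the whole image and then dropping every second pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Downsample a single-channel, row-major float image by 2 in each direction.
// Assumptions (not from any particular library): binomial 5x5 kernel and
// clamp-to-edge border handling.
std::vector<float> downsample2x(const std::vector<float>& src, int width, int height)
{
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    // 1D binomial weights; the 5x5 kernel is their outer product (sum = 256).
    static const float w[5] = {1.f, 4.f, 6.f, 4.f, 1.f};

    // Output keeps source pixels 0, 2, 4, ... in each direction.
    const int outW = (width + 1) / 2;
    const int outH = (height + 1) / 2;
    std::vector<float> dst(static_cast<std::size_t>(outW) * outH);

    for (int oy = 0; oy < outH; ++oy) {
        for (int ox = 0; ox < outW; ++ox) {
            float sum = 0.f;
            for (int ky = -2; ky <= 2; ++ky) {
                // Clamp the source row to the image bounds.
                const int sy = std::min(std::max(2 * oy + ky, 0), height - 1);
                for (int kx = -2; kx <= 2; ++kx) {
                    // Clamp the source column to the image bounds.
                    const int sx = std::min(std::max(2 * ox + kx, 0), width - 1);
                    sum += w[ky + 2] * w[kx + 2]
                         * src[static_cast<std::size_t>(sy) * width + sx];
                }
            }
            dst[static_cast<std::size_t>(oy) * outW + ox] = sum / 256.f;
        }
    }
    return dst;
}
```

Each output pixel depends only on a 5x5 neighborhood of the source, so a direct CUDA port would presumably assign one thread per output pixel. My question is whether there is something faster than that.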