I am currently starting to get back into GPU computing and need to solve a problem where I have to do parallel calculations on image sequences.
For the calculation I need to access the color values for a range of images and do statistics on those.
Since all my kernels need to access several pictures, what’s the best way to access these? Would all of the pictures be copied to memory on the card? What are my options to access the color values? Are there already methods to access those with sub-pixel accuracy (using linear interpolation in hardware)?
Any advice on how to start working on a problem like this would be greatly appreciated. Pointers to tutorials etc. on similar problems would also be great!
You would allocate the images as CUDA arrays, memcpy the data from the CPU to those arrays on the device, bind them as textures, and then sample them in the kernel with tex2D. And yes, tex2D gives you sub-pixel access, optionally with bilinear filtering done in hardware. There are a ton of samples which do this sort of thing; search the samples directory for e.g. “cudaBindTexture” (or “cudaCreateTextureObject” on newer toolkits, where the older texture reference API has been removed).
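To make the steps concrete, here is a minimal sketch of the allocate / copy / bind / sample sequence for a single image. It assumes the texture object API (cudaCreateTextureObject) rather than cudaBindTexture, a single-channel float image, and a made-up 2x2 test image; the names sampleKernel, CHECK and h_img are just for illustration. Treat it as a starting point, not a tuned implementation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                      \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            fprintf(stderr, "CUDA error: %s (line %d)\n",                \
                    cudaGetErrorString(err_), __LINE__);                 \
            exit(EXIT_FAILURE);                                          \
        }                                                                \
    } while (0)

// Reads one filtered sample at a sub-pixel position (x, y).
__global__ void sampleKernel(cudaTextureObject_t tex, float x, float y, float* out)
{
    *out = tex2D<float>(tex, x, y);
}

int main()
{
    // Made-up 2x2 single-channel image, row-major.
    const int width = 2, height = 2;
    float h_img[width * height] = {0.0f, 10.0f, 20.0f, 30.0f};

    // 1) Allocate a CUDA array on the device and copy the image into it.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    CHECK(cudaMallocArray(&arr, &desc, width, height));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, h_img,
                              width * sizeof(float),   // source pitch in bytes
                              width * sizeof(float),   // row width in bytes
                              height, cudaMemcpyHostToDevice));

    // 2) Describe the array as a texture resource.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    // 3) Ask for hardware bilinear filtering and unnormalized pixel coordinates.
    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    // 4) Sample between the four pixels. Texel centers sit at i + 0.5,
    //    so (1.0, 1.0) lies exactly in the middle of the 2x2 image.
    float* d_out = nullptr;
    CHECK(cudaMalloc(&d_out, sizeof(float)));
    sampleKernel<<<1, 1>>>(tex, 1.0f, 1.0f, d_out);
    CHECK(cudaGetLastError());

    float h_out = 0.0f;
    CHECK(cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost));
    printf("filtered sample at (1.0, 1.0): %f\n", h_out);

    // 5) Clean up.
    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(d_out));
    return 0;
}
```

Built with nvcc, this should print a value of about 15, i.e. the average of the four pixels. For a sequence of images you would repeat the allocate / copy / create steps per image and pass the resulting texture objects to your kernel.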
I’d also take a look at the NVIDIA Performance Primitives library (https://developer.nvidia.com/npp), which has optimised functions for common operations such as histograms.
You might also consider using OpenCV (http://opencv.org/) rather than raw CUDA. It’s easier to work with and has a very large number of built-in functions, from primitives like add/subtract through to highly complex image processing operations. You may find what you want there, or you can build it by composing simpler ops.