CUDA Prefix Scan

Hi all,

I’m new to CUDA parallel programming. I’m using a prefix scan algorithm to compute an integral image, but the output that should be transformed does not change. I think I’m doing something wrong when I call the kernel function. When I call it, my dimBlock is 16 and my dimGrid is
dim3 dimGrid((int)ceil(height / dimBlock.x), (int)ceil(width / dimBlock.y))

Thanks in advance.

The Thrust library (included in the CUDA Toolkit) provides prefix sum functions; see https://thrust.github.io/doc/group__prefixsums.html
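Whichever scan you end up using, it helps to have a CPU reference to compare the GPU output against. An integral image is an inclusive prefix sum along every row followed by one down every column. Here is a minimal host-side sketch, assuming a row-major int image. The function name integral_image_cpu is made up for illustration, and std::partial_sum stands in as the sequential counterpart of an inclusive scan.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// CPU reference for an integral image (hypothetical helper, not from the post).
// Assumes img holds height * width values in row-major order.
std::vector<int> integral_image_cpu(const std::vector<int>& img,
                                    std::size_t height, std::size_t width) {
    std::vector<int> out(img.size());

    // Pass 1: inclusive prefix sum along each row.
    for (std::size_t r = 0; r < height; ++r) {
        std::partial_sum(img.begin() + r * width,
                         img.begin() + (r + 1) * width,
                         out.begin() + r * width);
    }

    // Pass 2: inclusive prefix sum down each column,
    // adding the already-finished row above to the current row.
    for (std::size_t r = 1; r < height; ++r) {
        for (std::size_t c = 0; c < width; ++c) {
            out[r * width + c] += out[(r - 1) * width + c];
        }
    }
    return out;
}
```

If the array copied back from the device matches this result on a small test image, the kernel and the launch configuration are probably fine. If the array comes back unchanged, the kernel most likely never ran or wrote to a different buffer.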

Besides the Thrust library, you can look at the CUDA SDK sample code (http://developer.download.nvidia.com/compute/cuda/1.1-Beta/x86_website/samples.html), which has scan implementations. Other libraries also include a scan operation, such as CUB (https://nvlabs.github.io/cub/index.html) and CUDPP (http://cudpp.github.io/cudpp/2.2/index.html).
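On the launch configuration in the question: if height and width are integer types (an assumption, since the post does not show their declarations), then height / dimBlock.x is an integer division and is truncated before ceil ever sees it. For sizes that are not a multiple of 16 the grid then comes up one block short in that dimension. A common way around this is an integer ceiling division. The helper name div_up below is hypothetical.

```cpp
// Ceiling division for integers: the smallest q such that q * b >= a.
// Hypothetical helper; assumes a >= 0 and b > 0.
inline int div_up(int a, int b) {
    return (a + b - 1) / b;
}

// Possible use in the launch setup from the question (not compiled here):
//   dim3 dimGrid(div_up(height, dimBlock.x), div_up(width, dimBlock.y));
```

With a rounded-up grid, the kernel also needs a bounds check so threads past the image edge do nothing. It is also worth confirming that the kernel maps the x index to rows, since the x dimension of the grid is sized from height here.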