CUDA Prefix Scan

Hi all,

I’m new to CUDA parallel programming. I’m using a prefix scan algorithm to compute an integral image, but the image that should be transformed does not change. I think I’m doing something wrong when I launch the kernel. My dimBlock is 16 and my dimGrid is:
dim3 dimGrid((int)ceil(height / dimBlock.x), (int)ceil(width / dimBlock.y))

Thanks in advance.

The Thrust library (included in the CUDA Toolkit) provides prefix sum functions; see Thrust: Prefix Sums.
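
As a rough illustration of the Thrust route, here is a minimal sketch that scans a single image row on the GPU. The helper name scan_row and the use of int pixels are assumptions for the example, not something from the question; thrust::inclusive_scan, thrust::device_vector and thrust::copy are the actual Thrust calls.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/scan.h>
#include <vector>

// Hypothetical helper: inclusive prefix sum of one image row on the GPU.
std::vector<int> scan_row(const std::vector<int>& row) {
    // Copy the row from host memory to device memory.
    thrust::device_vector<int> d_row(row.begin(), row.end());

    // In-place inclusive scan: element i becomes row[0] + ... + row[i].
    thrust::inclusive_scan(d_row.begin(), d_row.end(), d_row.begin());

    // Copy the scanned row back to the host.
    std::vector<int> out(row.size());
    thrust::copy(d_row.begin(), d_row.end(), out.begin());
    return out;
}
```

Compiled with nvcc, scan_row({1, 2, 3, 4, 5}) should give {1, 3, 6, 10, 15}. An integral image needs this scan along every row and then along every column of the row-scanned result, so a full version would loop over rows and columns or scan a transposed copy.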

Besides the Thrust library, you can look at the CUDA SDK sample code, which includes scan implementations: [url]http://developer.download.nvidia.com/compute/cuda/1.1-Beta/x86_website/samples.html[/url]. Other libraries also provide scan operations: CUB [url]https://nvlabs.github.io/cub/index.html[/url] and CUDPP [url]http://cudpp.github.io/cudpp/2.2/index.html[/url].
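
On the launch configuration in the question: if height and width are integers, height / dimBlock.x is an integer division, so the result is already truncated before ceil sees it and the last partial block of rows or columns is never launched. This may not be the only reason the image stays unchanged, but it is worth ruling out. Below is a small host-side sketch of a rounding-up division. The names div_up, GridSize and grid_for_image are made up for illustration, and GridSize only stands in for dim3 so the snippet needs no CUDA headers.

```cpp
// Host-side only, so it compiles as plain C++ or inside a .cu file.

// Hypothetical helper: integer ceiling division, i.e. how many blocks of
// `block` threads are needed to cover `n` elements.
constexpr unsigned int div_up(unsigned int n, unsigned int block) {
    return (n + block - 1) / block;
}

// Hypothetical stand-in for dim3.
struct GridSize {
    unsigned int x;
    unsigned int y;
};

// Assumes x indexes columns (width) and y indexes rows (height);
// swap the arguments if the kernel uses the opposite convention.
constexpr GridSize grid_for_image(unsigned int width, unsigned int height,
                                  unsigned int block_x, unsigned int block_y) {
    return GridSize{div_up(width, block_x), div_up(height, block_y)};
}
```

In a .cu file the same idea would read dim3 dimGrid(div_up(width, dimBlock.x), div_up(height, dimBlock.y)). For a 100 x 50 image with 16 x 16 blocks this gives a 7 x 4 grid, whereas the truncating 100 / 16 gives only 6 blocks in x. The kernel then needs a bounds check so that threads past the image edge do nothing. Note also that the question pairs x with height, while the sketch assumes x runs along the width; whichever convention is used has to match the indexing inside the kernel.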