Non-separable filtering in 2, 3 and 4 dimensions

Since the NPP library only supports 2D convolution for integers, and the CUDA SDK only includes examples of separable convolution, I have written my own library for non-separable convolution in 2, 3 and 4 dimensions. A chapter describing the implementations will be published in the book “GPU Pro 5”. Enjoy!
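
For readers new to the term: a filter is separable only if its kernel can be written as an outer product of 1D vectors, so that it can be applied one dimension at a time. A non-separable kernel has to be applied as a full N-dimensional sum. The pure-Python sketch below illustrates the 2D case on the CPU. It is not the library's GPU code; the function name `convolve2d`, the zero padding at the borders and the odd kernel size are assumptions made for this example.

```python
def convolve2d(image, kernel):
    """Direct (non-separable) 2D convolution, illustrative CPU sketch.

    Assumptions: image and kernel are lists of lists of numbers, the kernel
    has odd dimensions, and pixels outside the image are treated as zero.
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_half, c_half = k_rows // 2, k_cols // 2
    output = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            # Full 2D sum over the kernel: no row pass / column pass split.
            for j in range(k_rows):
                for i in range(k_cols):
                    # Convolution flips the kernel relative to the image.
                    sy = y - (j - r_half)
                    sx = x - (i - c_half)
                    if 0 <= sy < rows and 0 <= sx < cols:
                        acc += kernel[j][i] * image[sy][sx]
            output[y][x] = acc
    return output


# This kernel has rank 3, so it is not an outer product of two 1D vectors
# and cannot be applied as two separate 1D passes.
kernel = [[1.0, 2.0, 0.0],
          [0.0, 3.0, 0.0],
          [0.0, 0.0, 4.0]]

# A 5x5 image with a single impulse in the centre.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0

result = convolve2d(image, kernel)
```

Convolving an impulse reproduces the kernel around the impulse position, which makes for a quick sanity check. The 3D and 4D cases follow the same pattern with one more pair of nested loops per extra dimension, which is why the cost grows so quickly and a GPU implementation becomes attractive.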