Box Filtering using NPP


I was testing the NPP boxFilter.cpp sample. I replaced the original image load and copy-to-GPU steps with a cudaMalloc pointer holding my own image data (512x512, uint8 type). After that, I allocated memory for the result with cudaMalloc (also uint8 type), sizing it as (rows - mask.height + 1) x (cols - mask.width + 1).

The pitch in this case for the source and destination images would be equal to their first dimension, i.e., the rows of the source image (512) and the rows of the destination image (508 for a mask size of 5, using the above formula for full-image filtering).
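The size arithmetic above can be sketched as follows. This is only an illustration of the formula in this post: `filteredExtent` is a hypothetical helper name, not an NPP function, and the tight pitch assumes one byte per pixel with no row padding.

```cpp
// Hypothetical helper (not part of NPP): extent of the filtered output along
// one dimension when only fully covered pixels are produced,
// i.e. source extent - mask extent + 1.
constexpr int filteredExtent(int srcExtent, int maskExtent) {
    return srcExtent - maskExtent + 1;
}

// Assumption: tightly packed uint8 data, so bytes per row == pixels per row.
constexpr int tightPitchBytes(int widthPixels) {
    return widthPixels * 1;  // 1 byte per uint8 pixel, no padding
}

// 512x512 source with the three mask sizes mentioned in this thread.
static_assert(filteredExtent(512, 5) == 508, "5x5 mask gives a 508x508 result");
static_assert(filteredExtent(512, 3) == 510, "3x3 mask gives a 510x510 result");
static_assert(filteredExtent(512, 7) == 506, "7x7 mask gives a 506x506 result");
static_assert(tightPitchBytes(filteredExtent(512, 5)) == 508, "tight pitch in bytes");
```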

This code runs fine, but when I replace the 5x5 mask with a 3x3 or 7x7 mask, I get NPP error -1. The documentation doesn't describe what NPP functions require in terms of image dimensions. I need help with this: I have uint8 data on the device and I'm interested in doing some image processing on it, but NPP is giving errors.

Could someone also help me with the concept of the anchor point: what is the valid range (is it {-1,-1} to {1,1}?), what is the centre point ({0,0}?), and how does this affect image processing?

I’m using NPP v1.0 with CUDA toolkit 2.3.

I also tried changing the ROI for 'Lena.pgm' in the original boxFilter.cpp, and it seems to me that the reference point is bottom-left: as I increased the ROI in steps, the processed image grew from the bottom-left instead of the top-left. But I always thought the device pointer returned by the NPP image class pointed to the top-left pixel of the image. That is what I would have for the data I want to process with NPP functions: a pointer to the first element of the matrix.



Hi Jaideep.

You found a bug in NPP 1.0. The BoxFilter primitive returns an error because it requires a line stride that is a multiple of four. As a workaround, I would recommend using the NPP pitched 2D allocators or the CUDA runtime's 2D allocators. Either will provide properly aligned memory.
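To see how this lines up with the mask sizes in the question, here is a minimal sketch of the stride check. It only assumes the multiple-of-four rule stated above; `strideAccepted` and `paddedStride` are hypothetical helper names, not NPP or CUDA calls. With tightly packed uint8 rows, the 5x5 case gives a 508-byte stride (a multiple of four), while the 3x3 and 7x7 cases give 510 and 506 bytes, which are not.

```cpp
#include <cstddef>

// Assumption from the reply above: the NPP 1.0 BoxFilter rejects line strides
// (in bytes) that are not a multiple of four.
constexpr bool strideAccepted(std::size_t strideBytes) {
    return strideBytes % 4 == 0;
}

// Hypothetical helper: round a tight row size up to the next multiple of
// `alignment`. Pitched allocators do something similar, but choose their own
// (usually much larger) alignment.
constexpr std::size_t paddedStride(std::size_t rowBytes, std::size_t alignment) {
    return (rowBytes + alignment - 1) / alignment * alignment;
}

// Tight destination strides for a 512-wide uint8 source image.
static_assert(strideAccepted(512), "source: 512 is a multiple of four");
static_assert(strideAccepted(508), "5x5 mask: 508 is a multiple of four");
static_assert(!strideAccepted(510), "3x3 mask: 510 is not a multiple of four");
static_assert(!strideAccepted(506), "7x7 mask: 506 is not a multiple of four");

// Padding the rows restores a stride that passes the check.
static_assert(paddedStride(510, 4) == 512, "510 bytes padded to 512");
static_assert(paddedStride(506, 4) == 508, "506 bytes padded to 508");
```

In practice the pitch should come from the allocator rather than be computed by hand. For example, cudaMallocPitch returns the pitch it chose, and that value is what gets passed as the line step.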