nppiFilterGaussBorder_8u_C1R oSrcSize and oSrcOffset parameters

Hi, I would like to run nppiFilterGaussBorder_8u_C1R on just a part of an image.
The documentation says the following:

pSrc: Source-Image Pointer.
oSrcSize: Source image width and height in pixels relative to pSrc.
oSrcOffset: The pixel offset that pSrc points to relative to the origin of the source image.

Based on this my understanding is:
If I have a 10x10 image, and I would like to apply nppiFilterGaussBorder_8u_C1R on the TopLeft{x:2, y:2} BottomRight{x:8, y:8} area

  • pSrc should be image.ptr + 2 * 10 + 2 [an offset pointer that points to row 2, column 2, counting from zero]
  • oSrcSize should be {10-2, 10-2} [Source image width and height in pixels relative to pSrc]
  • oSrcOffset should be {2, 2} [The pixel offset that pSrc points to relative to the origin of the source image.]
  • oSizeROI should be {8-2,8-2}

But if I calculate my parameters like this, I get NPP_OUT_OFF_RANGE_ERROR in some cases (for example, when the rectangle is at the bottom of the image). So my understanding is most probably incorrect.

Could you please help me work out how to set pSrc, oSrcSize and oSrcOffset in the example above?

Based on your SO posting and my own testing, you get the expected output if you pass the actual original image dimensions for oSrcSize. Here is my example:

$ cat t18.cu
#include <npp.h>
#include <nppi.h>
#include <iostream>
#include <stdexcept>   // std::runtime_error
const int sz = 32;
int main() {
    Npp8u* src_img = new Npp8u[sz*sz];
    for (int i = 0; i < sz*sz; i++) src_img[i] = (i&1)?80:40;
    Npp8u* cudaMem = nppsMalloc_8u(sz * sz);
    Npp8u* cudaMemDst = nppsMalloc_8u(sz * sz);

    if(cudaMem == nullptr)
    {
        throw std::runtime_error("Error malloc");
    }
    cudaMemcpy(cudaMem, src_img, sz*sz, cudaMemcpyHostToDevice);
    if(cudaMemDst == nullptr)
    {
        throw std::runtime_error("Error malloc dst");
    }
    cudaMemset(cudaMemDst, 0, sz*sz);
    NppiPoint const blurTopLeft{16, 16};       //inclusive
    NppiPoint const blurBottomRight{sz, sz}; //exclusive

    // oSrcSize: pass the full image dimensions. The commented-out line
    // (size remaining past pSrc) is the interpretation from the question.
//    NppiSize const oSrcSize{sz - blurTopLeft.x, sz - blurTopLeft.y};
    NppiSize const oSrcSize{sz, sz};
    NppiSize const oSizeROI{blurBottomRight.x - blurTopLeft.x, blurBottomRight.y - blurTopLeft.y};

    auto const nppiError{nppiFilterGaussBorder_8u_C1R(
        cudaMem + blurTopLeft.y * sz + blurTopLeft.x,
        sz,            // pitch
        oSrcSize,       // full source image width and height in pixels
        blurTopLeft,    // aka. oSrcOffset: The pixel offset that pSrc points to relative to the origin of the source image.
        cudaMemDst + blurTopLeft.y * sz + blurTopLeft.x,
        sz,            // dst pitch
        oSizeROI,
        NPP_MASK_SIZE_3_X_3,
        NPP_BORDER_REPLICATE
    )};

    if(nppiError != 0)
    {
        // The question reports NPP_OUT_OFF_RANGE_ERROR here with the remaining-size oSrcSize
        std::cerr << "nppiFilterGaussBorder_8u_C1R failed: " << nppiError << std::endl;
    }
    else
    {
        cudaMemcpy(src_img, cudaMemDst, sz*sz, cudaMemcpyDeviceToHost);
        for (int i = 0; i < sz; i++) {
            for (int j = 0; j < sz; j++)
                std::cout << (int)(src_img[i*sz+j]) << " ";
            std::cout << std::endl;
        }
    }
    nppsFree(cudaMem);
    nppsFree(cudaMemDst);
    delete[] src_img;
}
$ nvcc -o t18 t18.cu -lnpps -lnppif
$ compute-sanitizer ./t18
========= COMPUTE-SANITIZER
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 58 61 58 61 58 61 58 61 58 61 58 61 58 61 69
========= ERROR SUMMARY: 0 errors
$

No, I don’t have answers for all the questions that engenders. I have found NPP parameter handling to be inscrutable at times. You’re welcome to file a bug to request doc clarification.
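
The parameter choice that worked in the test above can be written down as a small helper. This is only a sketch of what the test suggests, not documented NPP behaviour: `Point`, `Size`, `BorderFilterArgs` and `makeBorderFilterArgs` are hypothetical names (the first two stand in for NppiPoint and NppiSize so the snippet builds without NPP), and it assumes a single-channel 8-bit image, so one pixel is one byte.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for NppiPoint / NppiSize so the sketch builds without NPP.
struct Point { int x; int y; };
struct Size  { int width; int height; };

// Arguments for filtering one sub-rectangle of a single-channel 8-bit image.
struct BorderFilterArgs {
    std::size_t byteOffset; // add to both the source and destination base pointers
    Size srcSize;           // oSrcSize: full image dimensions, not the remainder
    Point srcOffset;        // oSrcOffset: where the offset pointer sits in the image
    Size roiSize;           // oSizeROI: extent of the rectangle to filter
};

// topLeft is inclusive, bottomRight is exclusive; pitch is the row stride in bytes.
inline BorderFilterArgs makeBorderFilterArgs(Size image, int pitch,
                                             Point topLeft, Point bottomRight) {
    // The rectangle must be non-empty and lie inside the image.
    assert(topLeft.x >= 0 && topLeft.y >= 0);
    assert(topLeft.x < bottomRight.x && topLeft.y < bottomRight.y);
    assert(bottomRight.x <= image.width && bottomRight.y <= image.height);
    assert(pitch >= image.width);
    return BorderFilterArgs{
        static_cast<std::size_t>(topLeft.y) * pitch + topLeft.x,
        image,
        topLeft,
        Size{bottomRight.x - topLeft.x, bottomRight.y - topLeft.y}};
}
```

For the 10x10 image from the question (pitch 10, top-left {2, 2}, bottom-right {8, 8} exclusive) this gives a byte offset of 22, oSrcSize {10, 10}, oSrcOffset {2, 2} and oSizeROI {6, 6}.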

@Robert_Crovella thank you for checking the issue. Let’s continue here as the SO thread is closed.

As I wrote there, I’m not sure that passing oSrcSize{256, 256} is the right approach, since nppiFilterGaussBorder_8u_C1R might sample outside the image this way. If it starts from row 224 and assumes there are 255 more rows in the input, then when it calculates the last row of the output image (y == 255) it has no way of knowing that it shouldn’t sample from input rows 256 and 257, as they are out of bounds. But judging by your output, it seems to work. I will file a bug, as the documentation is misleading and there is no way to work out how the function expects these parameters to be set.
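
One reading that would reconcile the two, offered as a guess rather than documented behaviour: the function may convert every sample position to absolute image coordinates using oSrcOffset and then clamp against oSrcSize. The output above is at least consistent with that. The last column differs from the interior (69 rather than 58), as edge replication at column 31 would produce, while the first ROI column (61) matches the interior, which suggests the pixel to the left of the ROI was read from the image. A minimal model of that clamping along one axis, with `sampledCoord` being a hypothetical name and not an NPP function:

```cpp
#include <algorithm>

// Hypothetical model of replicate-border sampling along one axis (not NPP code).
//   roiCoord  : position inside the ROI, relative to pSrc
//   tap       : filter tap offset, e.g. -1, 0 or +1 for a 3x3 mask
//   srcOffset : oSrcOffset component (where pSrc sits in the full image)
//   srcSize   : oSrcSize component (assumed here to be the full image extent)
// Returns the absolute image coordinate that would be read.
inline int sampledCoord(int roiCoord, int tap, int srcOffset, int srcSize) {
    int absolute = srcOffset + roiCoord + tap;
    // Replicate border: coordinates past either edge reuse the edge pixel.
    return std::max(0, std::min(absolute, srcSize - 1));
}
```

Under this model, with oSrcOffset.y = 224 and oSrcSize.height = 256, the last ROI row (roiCoord 31) with tap +1 maps to row 255 rather than row 256, so nothing outside the allocation is touched.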