NPP Library Image Processing (erosion + dilation) seems non-deterministic

ADGB · June 7, 2017, 3:15pm

Hello.

I am using the NVIDIA Performance Primitives to perform some image processing on a Jetson TX1 with CUDA 8.

I just want to use a simple function like ‘nppiDilate_8u_C1R’. The issue is that the output image is different each time I apply the function while the input image is the same.

I wrote a short program to prove it (compiled with ‘nvcc prog.cu -lnppi -arch=sm_53’):

#include <iostream>
#include <npp.h>

int main(void) {    
    const int width = 2592;
    const int height = 1944;
    const int nbPixels = width * height;

    unsigned char img[nbPixels];

    for (int i = 0; i < nbPixels; i++) img[i] = i % 10 == 0 ? 255 : 0;

    unsigned char *d_img;
    unsigned char *d_mask;

    NppiSize maskSize = {3, 3};
    Npp8u mask[9] = {
        1, 1, 1,
        1, 1, 1,
        1, 1, 1
    };
    NppiPoint anchor = {1, 1};
    NppiSize sizeROI = {width, height};

    cudaMalloc(reinterpret_cast<void **>(&d_img), nbPixels * sizeof(unsigned char));
    cudaMalloc(reinterpret_cast<void **>(&d_mask), sizeof(unsigned char) * maskSize.height * maskSize.width);
    cudaMemcpy(d_mask, mask, maskSize.height * maskSize.width, cudaMemcpyHostToDevice);
    cudaMemcpy(d_img, img, nbPixels * sizeof(unsigned char), cudaMemcpyHostToDevice);

    nppiDilate_8u_C1R(d_img, width, d_img, width, sizeROI, d_mask, maskSize, anchor);

    cudaMemcpy(img, d_img, width * height * sizeof(unsigned char), cudaMemcpyDeviceToHost);

    int count = 0;
    for (int i = 0; i < nbPixels; i++) count += img[i] / 255;
    std::cout << count << std::endl;

    return 0;
}

The number of white pixels can be 3 830 585 or 3 859 498 for example.

I find it strange that a simple dilation operation is non-deterministic, as if a random component were involved.

Is it the expected behavior or am I doing something wrong?

njuffa · June 7, 2017, 3:46pm

I have never used NPP. But from reading along for many years, a common programmer error with NPP functions seems to be an incorrectly defined region of interest (ROI). That leads to out-of-bounds memory accesses, which could explain “random” output: the result then depends on whatever data happens to lie outside the image. So I would suggest double-checking the ROI.
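The ROI check suggested above can be sketched on the host side. This is only an illustration under stated assumptions: `Size`, `Point`, `InsetRoi` and `insetRoi` are hypothetical names (stand-ins for `NppiSize` and `NppiPoint` so the snippet builds without NPP), the image is 8-bit single channel, and the mask is assumed to cover columns `x - anchor.x` through `x - anchor.x + mask.width - 1` (likewise for rows). It computes how far the ROI has to be pulled in so that no mask tap falls outside the image.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for NppiSize and NppiPoint so the sketch builds without NPP.
struct Size  { int width; int height; };
struct Point { int x; int y; };

// Start offset (in bytes) and size of an ROI that is safe to process.
struct InsetRoi {
    std::ptrdiff_t byteOffset;
    Size size;
};

// Shrinks the full-image ROI so that every pixel the mask touches lies inside
// the image. Assumes 1 byte per pixel (8u_C1) and that the mask covers columns
// [x - anchor.x, x - anchor.x + mask.width - 1], and likewise for rows.
InsetRoi insetRoi(Size image, int stepBytes, Size mask, Point anchor) {
    assert(anchor.x >= 0 && anchor.x < mask.width);
    assert(anchor.y >= 0 && anchor.y < mask.height);
    assert(image.width >= mask.width && image.height >= mask.height);

    const int left   = anchor.x;                    // taps to the left of the anchor
    const int top    = anchor.y;                    // taps above the anchor
    const int right  = mask.width  - 1 - anchor.x;  // taps to the right
    const int bottom = mask.height - 1 - anchor.y;  // taps below

    InsetRoi roi;
    roi.byteOffset = static_cast<std::ptrdiff_t>(top) * stepBytes + left;
    roi.size = { image.width - left - right, image.height - top - bottom };
    return roi;
}
```

For the 2592 x 1944 image with a 3 x 3 mask anchored at (1, 1), this gives an offset of one row plus one pixel and an ROI of 2590 x 1942. The source and destination pointers would both be advanced by `byteOffset` before the call, with the reduced size passed as the ROI.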

BulatZiganshin · June 7, 2017, 6:39pm

I have no idea what dilation is, but if this computation involves floating-point data, the reason may be a combination of:

FP computation results may depend on the order of operations, since FP arithmetic does not obey the usual laws of math (addition is not associative)
the order in which the grid's blocks execute in a kernel is undefined
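A small host-side sketch of the first point, with hypothetical function names: summing the same three `float` values in two different orders gives two different results. Note that `nppiDilate_8u_C1R` works on 8-bit integer pixels, so this effect would matter only for floating-point variants.

```cpp
#include <cassert>

// Adds a, b, c as (a + b) + c. Intermediate results are stored in float
// variables so each step is rounded to single precision.
float sumLeftToRight(float a, float b, float c) {
    float t = a + b;
    return t + c;
}

// Adds a, b, c as a + (b + c).
float sumRightToLeft(float a, float b, float c) {
    float t = b + c;
    return a + t;
}
```

With `a = 1e8f`, `b = -1e8f`, `c = 1.0f`, the first order yields `1.0f`, while in the second order `c` is absorbed when added to `b` (adjacent floats near 1e8 are 8 apart) and the result is `0.0f`. If a kernel accumulated values in whatever order its blocks happened to run, the output could vary from run to run in the same way.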

ADGB · June 8, 2017, 12:13pm

Thank you both for the answers.

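I was doing something very, very bad and stupid. It is obvious now that I have read my code carefully: I tried to apply the image processing operation “in-place”. The source and destination images were the same buffer (d_img is passed as both the source and the destination of nppiDilate_8u_C1R), which obviously leads to weird results: each output pixel depends on its neighbors, and some of those neighbors may already have been overwritten.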
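The hazard can be reproduced on the CPU. The following is a minimal reference sketch, not NPP's implementation; all function names are hypothetical. It dilates an image containing a single white pixel with a 3 x 3 mask, once into a separate destination buffer and once in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference 3x3 dilation (maximum over the neighborhood) in raster order.
// Neighbors outside the image are skipped. src and dst are allowed to alias
// here on purpose, so that the in-place hazard can be observed.
void dilate3x3(const std::uint8_t* src, std::uint8_t* dst, int width, int height) {
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::uint8_t m = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy >= 0 && yy < height && xx >= 0 && xx < width)
                        m = std::max(m, src[yy * width + xx]);
                }
            }
            dst[y * width + x] = m;
        }
    }
}

int countWhite(const std::vector<std::uint8_t>& img) {
    return static_cast<int>(std::count(img.begin(), img.end(), std::uint8_t{255}));
}

// Black image with one white pixel at the center.
std::vector<std::uint8_t> singleWhitePixel(int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<std::uint8_t> img(static_cast<std::size_t>(width) * height, 0);
    img[(height / 2) * width + width / 2] = 255;
    return img;
}

// Correct usage: separate source and destination buffers.
int whiteCountOutOfPlace(int width, int height) {
    const std::vector<std::uint8_t> src = singleWhitePixel(width, height);
    std::vector<std::uint8_t> dst(src.size(), 0);
    dilate3x3(src.data(), dst.data(), width, height);
    return countWhite(dst);
}

// Faulty usage: the same buffer is read and written.
int whiteCountInPlace(int width, int height) {
    std::vector<std::uint8_t> img = singleWhitePixel(width, height);
    dilate3x3(img.data(), img.data(), width, height);
    return countWhite(img);
}
```

On a 7 x 7 image, `whiteCountOutOfPlace` returns 9 (the expected 3 x 3 block), while `whiteCountInPlace` returns a larger count because freshly written white pixels are read again and spread further. The CPU loop is deterministic; on the GPU the amount of spreading presumably depends on the order in which thread blocks read and write, which would explain the varying counts. In the program above, the fix is to allocate a second device buffer of the same size and pass it as the destination.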
