nppiErode_8u_C1R from NPP library

Hello,

I was wondering if someone could show me how to use the nppiErode_8u_C1R function from the NPP library. This is what I have tried, without success:

    // Used to store the rasters on the device
    Npp8u* d_rasterGrayscale;
    Npp8u* d_erotionResultGrayscale;
    cudaMalloc((void**) &d_rasterGrayscale, sizeof(uint8) * width * height);
    cudaMalloc((void**) &d_erotionResultGrayscale, sizeof(uint8) * width * height);

    // Apply erosion from the NPP library
    cudaMemcpy(d_rasterGrayscale, rasterGrayscale, sizeof(uint8) * width * height, cudaMemcpyHostToDevice);
    uint8 h_mask[] = {1,1,1,1,1,1,1,1,1};
    uint8* d_mask;
    cudaMalloc((void**) &d_mask, sizeof(uint8) * 9);
    cudaMemcpy(d_mask, h_mask, 9, cudaMemcpyHostToDevice);
    NppiSize roi;   // Apply erosion to the whole image
    roi.width = width;
    roi.height = height;
    NppiSize maskSize;
    maskSize.width = 3;
    maskSize.height = 3;
    NppiPoint anchor;
    anchor.x = 1;
    anchor.y = 1;
    NppStatus erotionStatus;

    erotionStatus = nppiErode_8u_C1R(d_rasterGrayscale, width, d_erotionResultGrayscale, width, roi, d_mask, maskSize, anchor);
    cudaMemcpy(erotionResultGrayscale, d_erotionResultGrayscale, sizeof(uint8) * width * height, cudaMemcpyDeviceToHost);

    // Free the memory on the device
    cudaFree(d_rasterGrayscale);
    cudaFree(d_erotionResultGrayscale);
    cudaFree(d_mask);

The erotionStatus reports: NPP_TEXTURE_BIND_ERROR

Any advice would be greatly appreciated.

Thanks,

Cristobal

Hi Cristobal,

At first glance, I would say you need to fix your ROI and offsets. The erode function with a mask size of 3x3 can only produce images that are two pixels smaller than the input image in each dimension (one pixel on each side). NPP functions do not automatically replicate pixel values when the algorithm would read pixels outside of the input image.

In your particular case, you would need to set

roi.width  = width  - 2; // generally: roi.width  = image.width  - kernel.width  + 1;
roi.height = height - 2; // generally: roi.height = image.height - kernel.height + 1;

Because you are also moving the mask to be centered over the input pixel, you would need to offset the input image to start at pixel (1, 1) instead of pixel (0,0), which would be achieved by passing

nppiErode_8u_C1R(d_rasterGrayscale + width /* more generally line step */ + 1, ...)
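The ROI and offset arithmetic described above can be sketched as a small host-side helper. This is only an illustration: `ValidRegion` and `computeValidRegion` are hypothetical names, not part of NPP, and the sketch assumes an 8-bit single-channel image (one byte per pixel) and that the source pointer should start at the anchor position, which generalizes the (1, 1) offset given for the centered 3x3 mask.

```cpp
#include <cstddef>

// Hypothetical helper type (not part of NPP).
struct ValidRegion {
    int roiWidth;              // width of the region that can be processed
    int roiHeight;             // height of the region that can be processed
    std::ptrdiff_t srcOffset;  // byte offset to add to the source pointer
};

// Computes the ROI and the source offset for a mask-based filter that must
// not read outside the image. lineStep is the distance in bytes between the
// starts of two consecutive rows (the image width for a tightly packed
// buffer, or the pitch for a pitched allocation).
// Assumption: 8-bit single-channel data, so one pixel is one byte.
ValidRegion computeValidRegion(int imageWidth, int imageHeight,
                               int maskWidth, int maskHeight,
                               int anchorX, int anchorY,
                               int lineStep)
{
    ValidRegion r;
    // roi = image - kernel + 1, as stated in the reply above.
    r.roiWidth  = imageWidth  - maskWidth  + 1;
    r.roiHeight = imageHeight - maskHeight + 1;
    // Skip anchorY full rows, then anchorX pixels within the row.
    r.srcOffset = static_cast<std::ptrdiff_t>(anchorY) * lineStep + anchorX;
    return r;
}
```

For a tightly packed 640x480 image with a 3x3 mask anchored at (1, 1), this gives a 638x478 ROI and an offset of `width + 1` bytes, matching the values in the reply.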

Also, unless your image width is a multiple of 64, you are likely to lose a huge amount of performance by not using the CUDA or NPP 2D malloc functions, which start each line of the image optimally aligned for maximum performance.

–Frank

This doesn’t work for me. Here is my code:

int pitch=0, pitcherod=0;
d_label=nppiMalloc_8u_C1(width, height, &pitch);
derod_label=nppiMalloc_8u_C1(width, height, &pitcherod);

//NPP Erode/Dilate
NppiSize oMaskSize = {3, 3};
// mask... is this the correct way to initialize the mask??
Npp8u mask[9] = {0,1,0,
                 1,1,1,
                 0,1,0};
// set anchor point inside the mask to (1, 1)
NppiPoint oAnchor = {2, 2};
NppiSize oSizeROI = {width - oMaskSize.width +1, height - oMaskSize.height +1};

// run erode
NppStatus eStatusNPP;
//eStatusNPP =  nppiErode_8u_C1R(d_label + width + 1, width,  derod_label, pitcherod, roi, mask, maskSize, anchor);
eStatusNPP = nppiErode_8u_C1R(d_label+pitch+1, pitch, derod_label, pitcherod, oSizeROI, mask, oMaskSize, oAnchor);
printf("Erode error: %i \n", eStatusNPP);
eStatusNPP = nppiDilate_8u_C1R(d_label+pitch+1, pitch, derod_label, pitcherod, oSizeROI, mask, oMaskSize, oAnchor);
printf("Dilate error: %i \n", eStatusNPP);

Two things caught my eye:

  • I assume you want the mask to be centered. For that you'd have to set oAnchor = {1, 1}. Indexing in NPP is the same as in C, i.e. it starts at 0.
  • The mask data must reside on the device, i.e. you must allocate a device-pointer and copy the mask to the device.
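To make the zero-based anchor concrete, here is a CPU-only reference sketch of a masked erosion. It is not NPP's implementation: `erodeReference` is a hypothetical name, and the sketch assumes the usual definition in which each output pixel is the minimum of the source pixels lying under non-zero mask entries, with mask entry (mx, my) mapped to the source offset (mx - anchorX, my - anchorY). With a 3x3 mask, an anchor of (1, 1) puts the mask center on the current pixel, while (2, 2) would hang the mask off its bottom-right corner.

```cpp
#include <algorithm>
#include <cstdint>

// CPU reference for a masked erosion on an 8-bit single-channel image.
// src must already point at the first ROI pixel, offset far enough into the
// image that the negative relative offsets below stay inside the buffer.
// srcStep and dstStep are row strides in bytes.
void erodeReference(const std::uint8_t* src, int srcStep,
                    std::uint8_t* dst, int dstStep,
                    int roiWidth, int roiHeight,
                    const std::uint8_t* mask, int maskWidth, int maskHeight,
                    int anchorX, int anchorY)
{
    for (int y = 0; y < roiHeight; ++y) {
        for (int x = 0; x < roiWidth; ++x) {
            std::uint8_t lowest = 255;
            for (int my = 0; my < maskHeight; ++my) {
                for (int mx = 0; mx < maskWidth; ++mx) {
                    // Zero mask entries do not contribute to the minimum.
                    if (mask[my * maskWidth + mx] == 0) continue;
                    // Position of this mask entry relative to the anchor.
                    int sy = y + my - anchorY;
                    int sx = x + mx - anchorX;
                    lowest = std::min(lowest, src[sy * srcStep + sx]);
                }
            }
            dst[y * dstStep + x] = lowest;
        }
    }
}
```

Running this on a small host buffer and comparing against the GPU result is one way to check that the anchor, ROI, and offsets passed to the NPP call mean what you expect.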

Hi Frank,

I got the 3x3 mask to work, but when I increase the mask to 5x5 it stops working:

NppiSize oMaskSize = {5, 5};
Npp8u mask[25] = {0,0,1,0,0,
                  0,1,1,1,0,
                  1,1,1,1,1,
                  0,1,1,1,0,
                  0,0,1,0,0};
NppiPoint oAnchor = {(oMaskSize.width-1)/2, (oMaskSize.height-1)/2};
NppiSize oSizeROI = {width - oMaskSize.width + (oMaskSize.width-1)/2, height - oMaskSize.height + (oMaskSize.height-1)/2};

NppStatus eStatusNPP;
eStatusNPP = nppiErode_8u_C1R(d_label+pitchu+((oMaskSize.width-1)/2), pitchu, d_label, pitchu, oSizeROI, d_mask, oMaskSize, oAnchor);
eStatusNPP = nppiDilate_8u_C1R(d_label+pitchu+((oMaskSize.height-1)/2), pitchu, d_label, pitchu, oSizeROI, d_mask, oMaskSize, oAnchor);
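One way to sanity-check parameters like the ones in the 5x5 snippet above is to verify on the host that no source read can fall outside the image. The sketch below is an illustration only: `readsStayInside` is a hypothetical helper, not an NPP function, and it assumes the same anchor convention as Frank's replies (the mask entry at the anchor sits on the current pixel). For a 5x5 mask anchored at (2, 2), the offset `pitchu + 2` corresponds to starting at pixel (2, 1), and the posted ROI works out to (width - 3) x (height - 3), whereas Frank's general formula `image - kernel + 1` gives (width - 4) x (height - 4).

```cpp
// Returns true if every source pixel read by a maskWidth x maskHeight filter
// stays inside an imageWidth x imageHeight image, given that the source
// pointer was advanced to pixel (startX, startY) and the ROI is
// roiWidth x roiHeight.
// Assumption: the mask entry at (anchorX, anchorY) is aligned with the
// pixel currently being processed.
bool readsStayInside(int imageWidth, int imageHeight,
                     int startX, int startY,
                     int roiWidth, int roiHeight,
                     int maskWidth, int maskHeight,
                     int anchorX, int anchorY)
{
    // Top-left-most pixel touched by the mask at the first ROI pixel.
    int firstX = startX - anchorX;
    int firstY = startY - anchorY;
    // Bottom-right-most pixel touched by the mask at the last ROI pixel.
    int lastX = startX + (roiWidth  - 1) + (maskWidth  - 1 - anchorX);
    int lastY = startY + (roiHeight - 1) + (maskHeight - 1 - anchorY);
    return firstX >= 0 && firstY >= 0 &&
           lastX <= imageWidth - 1 && lastY <= imageHeight - 1;
}
```

For a 640x480 image, the posted values (start (2, 1), ROI 637x477) fail this check, while start (2, 2), i.e. an offset of `2 * pitchu + 2`, with a 636x476 ROI passes. The posted calls also pass d_label as both source and destination; this sketch does not cover whether that is safe.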