Problem with nppiFilter_8u_C1R getting a black output

Donos · April 14, 2011, 1:23pm

Hello,

I’m working on an image processing program and I want to implement convolution via NPP.

The problem is that whatever I pass to the nppiFilter_8u_C1R function, I get a black output (even though it is correctly sized).

Can someone help me ?

Here is the code (with little obfuscation because I’m not free to share it) :

customImageType image("My image.tif");
	float m [] ={0 , 1, 0, 1, -4, 1, 0, 1, 0};
	customImageType mask(m, 3, 3); //instantiation of the mask image

	image.renormalize(0,255); //renormalization and cast to unsigned char
	//allocation of source image on CPU with NPP structure
	npp::ImageCPU_8u_C1 oHostSrc(image.width, image.height);
	memcpy(oHostSrc.data(), image.data, image.nbPoints*sizeof(unsigned char));
	//allocation of mask on GPU with CUDA
	mask.castToLong();
	Npp32s *deviceMask;
	cudaMalloc((void**)&deviceMask, mask.nbPoints()*sizeof(long));
	cudaMemcpy(deviceMask, mask.data, mask.nbPoints*sizeof(long), cudaMemcpyHostToDevice);

	NppiSize maskSize = {mask.width,mask.height};
	NppiSize ROI = {image.width - mask.width + 1, image.height - mask.height + 1};
	//allocation of the destination image on GPU
	npp::ImageNPP_8u_C1 oDeviceDst(ROI.width, ROI.height);
	//allocation of the destination image on CPU
	npp::ImageCPU_8u_C1 oHostDst(oDeviceDst.size());
	NppiPoint anchor = {0,0};

	Npp32s* divisor = new Npp32s[1];
	divisor[0] = (Npp32s)mask.sum();
	Npp32s* deviceDivisor;
	cudaMalloc((void**) &deviceDivisor, sizeof(Npp32s));
	cudaMemcpy(deviceDivisor, divisor, sizeof(Npp32s), cudaMemcpyHostToDevice);

	//allocation of source image on GPU by copy of CPU image
	npp::ImageNPP_8u_C1 oDeviceSrc(oHostSrc);
	NppStatus ret=nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(),
					oDeviceDst.data(), oDeviceDst.pitch(),
					ROI, deviceMask, maskSize, anchor, deviceDivisor[0]);
	oDeviceDst.copyTo(oHostDst.data(), oHostDst.pitch());
	customImageType tmp((unsigned char*)oHostDst.data(), oHostDst.width(),oHostDst.height());
	tmp.save("MyProcessedImage.tif");

I implemented the erode and dilate functions with no problem…

I also tried nppiFilterBox_8u_C1R instead of nppiFilter_8u_C1R and it worked perfectly, so I suppose I’m doing something wrong with the mask, but I don’t know what…

Thanks in advance.

Crankie · May 25, 2011, 6:10am

Check the error code the function returns.

Frank_Jargstorff · May 25, 2011, 5:02pm

Hi Donos,

The first thing that caught my attention is that you seem to be passing a dereferenced device pointer as the scale-factor parameter (the last parameter, deviceDivisor[0]):

NppStatus ret=nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), oDeviceDst.data(), 
                                oDeviceDst.pitch(), ROI, deviceMask, maskSize, anchor, 
                                deviceDivisor[0]);

I’m surprised this doesn’t cause a seg-fault. Anyway, I think the first thing to check would be to simply pass 0 for that value. Ultimately, you probably want to find a scale factor that roughly matches the sum of the weights, so that the filter doesn’t change overall brightness.

–Frank
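
A CPU reference can make the role of the divisor easier to see. The following is only a sketch, assuming (as the NPP documentation describes it) that nppiFilter_8u_C1R divides the kernel-weighted sum by the divisor and saturates the result to 8 bits. The names filterReference and safeDivisor are hypothetical helpers, not part of NPP, and the kernel is applied without flipping, which makes no difference for the symmetric masks used here. Note that the divisor is an ordinary host integer passed by value, and that the weights of the 3x3 Laplacian mask from the first post sum to 0, so the sum cannot serve as the divisor there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: use the sum of the weights as divisor, but fall back
// to 1 when the weights sum to zero (as they do for a Laplacian mask).
std::int32_t safeDivisor(const std::vector<std::int32_t>& kernel)
{
    const std::int32_t sum =
        std::accumulate(kernel.begin(), kernel.end(), std::int32_t{0});
    return sum != 0 ? sum : 1;
}

// Hypothetical CPU reference for an integer-kernel filter on an 8-bit,
// single-channel image. Each output pixel is sum(kernel * source) / divisor,
// saturated to [0, 255]. The output only covers positions where the kernel
// fits entirely inside the source, i.e. (width - kw + 1) x (height - kh + 1).
std::vector<std::uint8_t> filterReference(const std::vector<std::uint8_t>& src,
                                          int width, int height,
                                          const std::vector<std::int32_t>& kernel,
                                          int kw, int kh,
                                          std::int32_t divisor)
{
    assert(divisor != 0);  // a zero divisor would be a division by zero here
    const int outW = width - kw + 1;
    const int outH = height - kh + 1;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            std::int32_t acc = 0;
            for (int j = 0; j < kh; ++j)
                for (int i = 0; i < kw; ++i)
                    acc += kernel[j * kw + i] * src[(y + j) * width + (x + i)];
            acc /= divisor;
            // Saturate: negative responses become 0, large ones become 255.
            acc = std::max<std::int32_t>(0, std::min<std::int32_t>(acc, 255));
            dst[y * outW + x] = static_cast<std::uint8_t>(acc);
        }
    }
    return dst;
}
```

With this model, a Laplacian mask applied to a smooth region yields values at or below zero, which saturate to black, whereas a box mask divided by the sum of its weights preserves brightness. That may explain part of what the first post describes, although it does not rule out other causes.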

Donos · May 26, 2011, 8:50am

@Crankie : I’ve got a NPP_NO_ERROR code !

@Frank : I just tried again with 0 instead of deviceDivisor[0] and still the same problem…

Thanks to both of you, I thought that nobody would try to suggest anything

gregorylf · August 24, 2011, 9:53pm

Would it be possible to post a sample of your working erode code? I have tried to modify the boxfilter code to no avail - I get the texture bind error. Working with 64 bit CentOS linux. I have modified the ROI and offsets to be well within the image so I am not walking off the edges, etc. Very new at this, but would really like to get erode and dilate working!

Thanks

Donos · August 25, 2011, 8:44am

Hello,

I’m sorry, but I rewrote everything as I decided not to use NPP anymore. I’m now on “regular” CUDA with kernels, grids, threads, etc.

By the way, all you need is in my first post, as the obfuscated part mainly concerns the custom image type I was working on.

Good luck

gregorylf · August 25, 2011, 2:03pm

Yes indeed! Thank you. My mistake was not allocating the mask into the device properly - as in not at all…
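
For anyone preparing the mask for upload, here is a host-side sketch of the conversion step. It assumes the coefficients start out as floats, as in the first post. Coeff32 is a stand-in for Npp32s (which NPP defines as a 32-bit signed integer), and toDeviceCoefficients and coefficientBytes are hypothetical helper names. The point is to size the buffer with the 32-bit element type: on 64-bit Linux, long is typically 8 bytes, so a buffer of long values copied with sizeof(long) would not have the layout that a filter reading 32-bit coefficients expects.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for Npp32s so the sketch compiles without the NPP headers.
using Coeff32 = std::int32_t;

// Hypothetical helper: round float coefficients to 32-bit integers.
std::vector<Coeff32> toDeviceCoefficients(const std::vector<float>& mask)
{
    std::vector<Coeff32> out;
    out.reserve(mask.size());
    for (float v : mask)
        out.push_back(static_cast<Coeff32>(std::lround(v)));
    return out;
}

// Hypothetical helper: byte count to pass to cudaMalloc and cudaMemcpy,
// based on the 32-bit element type rather than on sizeof(long).
std::size_t coefficientBytes(const std::vector<Coeff32>& coeffs)
{
    return coeffs.size() * sizeof(Coeff32);
}
```

The resulting vector's data() pointer and coefficientBytes() value would then be the source and size arguments of the cudaMemcpy call with cudaMemcpyHostToDevice.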
