Hello,
I’m working on an image processing program and I want to implement convolution via NPP.
The problem is that whatever I pass to the nppiFilter_8u_C1R function, I get a black output (even though it is correctly sized).
Can someone help me?
Here is the code (slightly obfuscated, because I’m not free to share it):
customImageType image("My image.tif");
float m [] ={0 , 1, 0, 1, -4, 1, 0, 1, 0};
customImageType mask(m, 3, 3); //instantiation of the mask image
image.renormalize(0,255); //renormalization to [0, 255] and cast to unsigned char
//allocation of source image on CPU with NPP structure
npp::ImageCPU_8u_C1 oHostSrc(image.width, image.height);
memcpy(oHostSrc.data(), image.data, image.nbPoints*sizeof(unsigned char));
//allocation of mask on GPU with CUDA
mask.castToLong();
Npp32s *deviceMask;
cudaMalloc((void**)&deviceMask, mask.nbPoints*sizeof(long));
cudaMemcpy(deviceMask, mask.data, mask.nbPoints*sizeof(long), cudaMemcpyHostToDevice);
NppiSize maskSize = {mask.width,mask.height};
NppiSize ROI = {image.width - mask.width + 1, image.height - mask.height + 1};
//allocation of the destination image on GPU
npp::ImageNPP_8u_C1 oDeviceDst(ROI.width, ROI.height);
//allocation of the destination image on CPU
npp::ImageCPU_8u_C1 oHostDst(oDeviceDst.size());
NppiPoint anchor = {0,0};
Npp32s* divisor = new Npp32s[1];
divisor[0] = (Npp32s)mask.sum();
Npp32s* deviceDivisor;
cudaMalloc((void**) &deviceDivisor, sizeof(Npp32s));
cudaMemcpy(deviceDivisor, divisor, sizeof(Npp32s), cudaMemcpyHostToDevice);
//allocation of source image on GPU by copy of CPU image
npp::ImageNPP_8u_C1 oDeviceSrc(oHostSrc);
NppStatus ret=nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(),
oDeviceDst.data(), oDeviceDst.pitch(),
ROI, deviceMask, maskSize, anchor, deviceDivisor[0]);
oDeviceDst.copyTo(oHostDst.data(), oHostDst.pitch());
customImageType tmp((unsigned char*)oHostDst.data(), oHostDst.width(),oHostDst.height());
tmp.save("MyProcessedImage.tif");
I implemented the erode and dilate functions with no problem…
I also tried nppiFilterBox_8u_C1R instead of nppiFilter_8u_C1R and it worked perfectly, so I suppose I’m doing something wrong with the mask, but I don’t know what…
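Some points in the code above may be worth checking. The mask {0, 1, 0, 1, -4, 1, 0, 1, 0} sums to 0, so mask.sum() yields a divisor of 0. nppiFilter_8u_C1R takes nDivisor by value as a host Npp32s, so deviceDivisor[0] dereferences a device pointer on the host. Npp32s is 32 bits wide, while long can be 64 bits on some platforms. Below is a minimal CPU-only sketch of the divisor arithmetic, not the NPP implementation: chooseDivisor and filterPixel are hypothetical helpers, std::int32_t stands in for Npp32s, and falling back to a divisor of 1 for a zero-sum kernel is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Npp32s is a 32-bit signed integer; std::int32_t stands in for it here.
static_assert(sizeof(std::int32_t) == 4, "kernel elements must be 4 bytes wide");

// Hypothetical helper (not part of NPP): pick a host-side divisor for an
// integer kernel. A zero-sum kernel such as the Laplacian falls back to 1
// (an assumption made for this sketch) so that no division by zero occurs.
std::int32_t chooseDivisor(const std::vector<std::int32_t>& kernel) {
    const std::int32_t sum =
        std::accumulate(kernel.begin(), kernel.end(), std::int32_t{0});
    return sum == 0 ? 1 : sum;
}

// Hypothetical CPU reference for a single output pixel: weighted sum of the
// window, integer division by the divisor, then saturation to [0, 255].
// window and kernel must have the same number of elements.
std::uint8_t filterPixel(const std::vector<std::uint8_t>& window,
                         const std::vector<std::int32_t>& kernel,
                         std::int32_t divisor) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < kernel.size(); ++i) {
        acc += static_cast<std::int32_t>(window[i]) * kernel[i];
    }
    acc /= divisor;
    return static_cast<std::uint8_t>(std::clamp<std::int32_t>(acc, 0, 255));
}
```

With such a helper, the divisor would stay in host memory and be passed directly as the last argument of nppiFilter_8u_C1R, and the mask would be converted to 32-bit integers before the cudaMemcpy.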
Thanks in advance.