nppi function computes wrong buffer size

Hello,
I’m using an nppi function to compute the minimum of an image with height 78 and width 139.
When I enable the memory checker, it reports one store exception.
I found out that the buffer size should be 80 instead of the 79 that is returned.

The code is:

MyImage<uint8_t> watermarkTemplate("my path");
watermarkTemplate.memCopyD2HPitched();

NppiSize roiSize;
uint8_t* d_nppMinBuf;
uint8_t* d_minimum;

int nppError = 0;

int nppMinBufSize;

roiSize.height = watermarkTemplate.getRowCount();
roiSize.width = watermarkTemplate.getColumnCount();

nppError = nppiMinGetBufferHostSize_8u_C1R(roiSize, &nppMinBufSize);

if(nppError != 0){
	cout<<"npp error: "<<nppError<<"\n";
}

CudaSafeCall( cudaMalloc(&d_nppMinBuf, nppMinBufSize) );
CudaSafeCall( cudaMalloc(&d_minimum, sizeof(*d_minimum)) );

nppError = nppiMin_8u_C1R(watermarkTemplate.getDevPitchedPtr(), watermarkTemplate.getPitchDevice(), roiSize, d_nppMinBuf, d_minimum);

if(nppError != 0){
	cout<<"npp error: "<<nppError<<"\n";
}

CudaSafeCall( cudaFree(d_nppMinBuf) );
CudaSafeCall( cudaFree(d_minimum) );

MyImage is just a class that holds the pointer to the image (and does the memory allocation).

Is it a bug?

Hello, Yaro. Thanks for reporting this. To investigate the issue, I would like to collect some more information from you. Which CUDA SDK version are you seeing this problem with? Is it 5.0? And which GPU are you using?

It’s CUDA 5.0 with a GTX 460 SE.

NPP recently changed the way it computes the buffer size. Unfortunately, the change will only take effect in the next release (the one after 5.5). The performance of the primitive used in your code has also been improved.
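
In the meantime, one possible workaround is to over-allocate the scratch buffer slightly. The sketch below is only an illustration: padScratchSize is a hypothetical helper, and the 16-byte rounding is an assumption rather than a documented NPP requirement. It simply turns the reported 79 into the 80 that was found to work above.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of NPP): rounds the size reported by
// nppiMinGetBufferHostSize_8u_C1R up to the next multiple of `alignment`.
// The default of 16 bytes is an assumption, not a documented requirement.
inline std::size_t padScratchSize(int reportedSize, std::size_t alignment = 16) {
    if (reportedSize < 0 || alignment == 0) {
        throw std::invalid_argument("invalid size or alignment");
    }
    const std::size_t size = static_cast<std::size_t>(reportedSize);
    // Integer round-up: sizes that are already a multiple stay unchanged.
    return (size + alignment - 1) / alignment * alignment;
}
```

With this helper, the allocation in the original code would become cudaMalloc(&d_nppMinBuf, padScratchSize(nppMinBufSize)), and the rest of the code stays the same.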

Thank you.