Several color images were tested with a Gaussian filter at sizes of 4096 * 10000 or 8192 * 10000, and the filter did not work properly. However, when the image was smaller, such as 1200 * 2000, it worked properly.
2、The result image is the same as the original image, without any changes.
3、extern "C" int nppi_GaussFilter(unsigned char* pSrcData, unsigned char* pDstData, int iHeight, int iWidth, int iChannel)
{
NppStatus t_NppStatus;
int srcElements = iHeight * iWidth * iChannel;
int dstElements = iHeight * iWidth * iChannel;
You haven’t properly offset your image to allow for filter mask positioning.
You cannot run a mask on every input pixel. The mask will cross over the image boundary, resulting in illegal access. Your code is broken.
I give an example of image offsetting in the linked example I provided. Yes, I understand it appears to work in some cases. Run any cases you like with compute-sanitizer if you want more info. Your posted code throws errors in compute-sanitizer even for an image size of 1000x1000.
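To make the offsetting idea concrete, here is a minimal sketch of the arithmetic only. It is not the linked example and not your code. The struct `InteriorRoi` and the function `interiorRoi` are hypothetical names made up for illustration, and the sketch assumes an odd, square mask anchored at its center and tightly packed or pitched 8-bit interleaved pixels. It computes how far to advance the source and destination pointers, and how much to shrink the ROI, so that the mask never reaches outside the allocation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the region a centered filter mask can process
// without reading outside the image buffer.
struct InteriorRoi {
    std::size_t byteOffset; // bytes from the buffer start to the first ROI pixel
    int width;              // ROI width in pixels
    int height;             // ROI height in pixels
};

// Assumptions (not taken from the original post):
//   - maskSize is odd (3, 5, 7, ...) and the mask is anchored at its center
//   - pixels are 8-bit, interleaved, `channels` bytes per pixel
//   - stepBytes is the row pitch in bytes (width * channels if unpadded)
InteriorRoi interiorRoi(int width, int height, int channels,
                        int stepBytes, int maskSize)
{
    assert(maskSize > 0 && maskSize % 2 == 1);
    const int radius = maskSize / 2; // pixels the mask extends on each side

    InteriorRoi roi{0, 0, 0};
    if (width <= 2 * radius || height <= 2 * radius) {
        return roi; // image too small for this mask: empty ROI
    }

    // Skip `radius` rows down and `radius` pixels to the right.
    roi.byteOffset = static_cast<std::size_t>(radius) * stepBytes
                   + static_cast<std::size_t>(radius) * channels;
    // Drop `radius` pixels from every edge.
    roi.width  = width  - 2 * radius;
    roi.height = height - 2 * radius;
    return roi;
}
```

With values like these, you would pass `pSrc + byteOffset`, `pDst + byteOffset`, the unchanged row steps, and an ROI of `{width, height}` to the filter call. The border pixels of the destination are then not written by the filter, so handle them separately if you need them.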
You may also wish to use proper CUDA error checking.