nppiFilterGaussPyramidLayerUpBorder does not work

I have a 512x512 image and want to upsample it to 1024x1024 using nppiFilterGaussPyramidLayerUpBorder_8u_C1R. I am debugging on a Tegra X2 with CUDA 9.0.

First, I use nppiGetFilterGaussPyramidLayerUpBorderDstROI to calculate the ROI for the upsampled image:

NPP_CHECK_NPP ( nppiGetFilterGaussPyramidLayerUpBorderDstROI (
512,
512,
&ROI_dstMin,
&ROI_dstMax,
2.0F ) );

Then I call the filter itself:

NPP_CHECK_NPP ( nppiFilterGaussPyramidLayerUpBorder_8u_C1R (
oDeviceSrc.data(),
oDeviceSrc.pitch(),
oSrcSize,
oSrcOffset,
oDeviceDst.data(),
oDeviceDst.pitch(),
ROI_dstMax,
2.0F,
6,
pKernel,
NPP_BORDER_REPLICATE ) );
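
One thing I have not yet ruled out is the filter arguments themselves. As I understand the NPP documentation (an assumption on my part, not verified on this board), nFilterTaps is expected to have the form 2 * k + 1, i.e. an odd number, and the pKernel coefficients are expected to sum to 1.0F. The host-side check below is only a sketch of that sanity test; gaussKernelLooksValid is a hypothetical helper of my own, not an NPP function, and it says nothing about whether pKernel points to device memory.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of NPP): checks a host-side copy of the
// kernel coefficients against two assumed requirements:
//   1. the number of taps is odd (of the form 2 * k + 1), and
//   2. the coefficients sum to 1.0 within a small tolerance.
bool gaussKernelLooksValid(const std::vector<float>& kernel,
                           float tolerance = 1e-4f)
{
    // An empty or even-length kernel fails the odd-tap assumption.
    if (kernel.size() % 2 == 0) {
        return false;
    }

    // Accumulate in double to limit rounding error in the sum.
    double sum = 0.0;
    for (float coefficient : kernel) {
        sum += coefficient;
    }

    return std::fabs(sum - 1.0) <= tolerance;
}
```

With this check, a 6-tap kernel like the one passed above would be rejected, while a 5-tap kernel such as {0.0625, 0.25, 0.375, 0.25, 0.0625} would pass.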

The first function produces a correct result. However, once execution enters the second function, the debugging session hangs and no error code is returned. The debug session cannot be cancelled from Nsight Eclipse, and the board has to be rebooted. What could be causing this failure? Thank you.