Is this a bug in nppiResizeSqrPixel_16u_C4R() regarding oDstROI, or have I misunderstood?

Before filing a bug report, I thought I’d ask here to make sure I haven’t misunderstood how this NPP function is supposed to work.

I am trying to use the destination region-of-interest parameter “oDstROI” of nppiResizeSqrPixel_16u_C4R() to perform a ‘crop’ operation after the ‘scale’ and ‘shift’ operations have been done. (I realize they are not separate sequential operations, but it helps to think of them as such for the purposes of this question.) I have succeeded in using the source ROI “oSrcROI” to do a ‘crop’ operation, but I really want to use the destination ROI instead.

I call nppiGetResizeRect() with oSrcROI set to the entire input image, nXFactor and nYFactor set to 0.45, nXShift and nYShift set to about 10% of the original image size, and eInterpolation set to NPPI_INTER_CUBIC.

I then run nppiResizeSqrPixel_16u_C4R() with oDstROI set to the rectangle returned by nppiGetResizeRect(). The other parameters to nppiResizeSqrPixel_16u_C4R() are set to the same values as described for nppiGetResizeRect() above.

What I would expect is that nppiResizeSqrPixel_16u_C4R() scales and shifts, then masks out a region of interest that matches where the scaled and shifted image ended up.
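To make that expectation concrete, here is a minimal sketch of the arithmetic I have in mind. It does not call NPP: the Rect struct and the expectedDstRect() helper are hypothetical stand-ins made up for illustration, and the mapping (dst = src * factor + shift, with outward rounding) is my assumption about the behavior, not something I have confirmed against nppiGetResizeRect().

```cpp
#include <cmath>

// Hypothetical stand-in for NppiRect (same field names), so this builds without NPP.
struct Rect { int x, y, width, height; };

// Where I expect the scaled and shifted source ROI to land in the destination image.
// Assumption: dst = src * factor + shift, origin rounded down, far edge rounded up.
// The real nppiGetResizeRect() may round differently.
Rect expectedDstRect(Rect src, double xFactor, double yFactor,
                     double xShift, double yShift) {
    const double x0 = src.x * xFactor + xShift;
    const double y0 = src.y * yFactor + yShift;
    const double x1 = (src.x + src.width) * xFactor + xShift;
    const double y1 = (src.y + src.height) * yFactor + yShift;

    Rect dst;
    dst.x = static_cast<int>(std::floor(x0));
    dst.y = static_cast<int>(std::floor(y0));
    dst.width = static_cast<int>(std::ceil(x1)) - dst.x;
    dst.height = static_cast<int>(std::ceil(y1)) - dst.y;
    return dst;
}
```

Under this model, passing the resulting rectangle as oDstROI should cover the whole scaled and shifted image, which is not what I observe.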

What I get instead is only part of the scaled and shifted image, because nppiResizeSqrPixel_16u_C4R() seems to place the oDstROI incorrectly.

I realize that this use case sounds a bit odd, since masking with an ROI that covers the whole result is pointless. The end goal, once the “bug is fixed”, is to set a smaller ROI than the rectangle from nppiGetResizeRect(), thereby achieving a crop.
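For reference, this is roughly how I would derive the smaller ROI: intersect the full rectangle from nppiGetResizeRect() with the crop window I want in destination coordinates. This is only a sketch; the Rect struct and intersect() helper are hypothetical names of my own and are not part of NPP.

```cpp
#include <algorithm>

// Hypothetical stand-in for NppiRect (same field names), so this builds without NPP.
struct Rect { int x, y, width, height; };

// Intersection of the full resized rectangle with the desired crop window.
// Returns a rectangle with zero width or height if the two do not overlap.
Rect intersect(Rect a, Rect b) {
    const int x0 = std::max(a.x, b.x);
    const int y0 = std::max(a.y, b.y);
    const int x1 = std::min(a.x + a.width, b.x + b.width);
    const int y1 = std::min(a.y + a.height, b.y + b.height);
    return Rect{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

The result is what I would then pass as oDstROI, assuming oDstROI is interpreted in destination image coordinates.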

I am running CUDA 12.1 on Linux.

Have I misunderstood how oDstROI is supposed to work? If not, has anyone else observed this behavior? If it is a bug, I will file a report.