Problem with nppiWarpAffine

I’m trying to use nppiWarpAffine to warp an image. My src and dst images are 640x480. For some combinations of coeff, I get an NPP_WRONG_INTERSECTION_QUAD_WARNING and the image doesn’t display. In the example below, if I set the last element of coeff to 0, it works fine, but if it is -1, I get the error. Mathematically, both should work. I suspect I have to modify rectIn or rectOut, but I can’t seem to get it to behave properly. I tried these exact parameters with ippiWarpAffine and it works perfectly. Anyone have any suggestions?

int widthIn = 640;
int heightIn = 480;
int widthOut = 640;
int heightOut = 480;

double coeff[2][3] = {{1.16, 0, -57}, {0, 1.16, -1}};

NppiSize sizeIn = {widthIn, heightIn};
NppiRect rectIn = {0, 0, widthIn, heightIn};
int pitchIn = widthIn * sizeof(float);

NppiRect rectOut = {0, 0, widthOut, heightOut};
int pitchOut = widthOut * sizeof(float);

NppStatus eStatusNPP = nppiWarpAffine_32f_C1R(in_d, sizeIn, pitchIn, rectIn,
                                              out_d, pitchOut, rectOut,
                                              coeff, NPPI_INTER_LINEAR);
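As a sanity check on the claim that both coefficient sets should work mathematically, here is a minimal host-side sketch in plain C (no NPP calls) that maps the four corners of the source rectangle through the forward affine transform and tests whether the result overlaps the output rectangle. The names BBox, warped_bbox and overlaps_roi are hypothetical helpers invented for this illustration, and the corner convention (pixel coordinates 0..width-1 and 0..height-1) is an assumption; NPP's internal check may differ.

```c
#include <assert.h>

/* Axis-aligned bounding box in floating-point pixel coordinates. */
typedef struct { double xmin, ymin, xmax, ymax; } BBox;

/* Hypothetical helper (not part of NPP): map the four corners of a source
   rectangle through the forward affine transform
       x' = c[0][0]*x + c[0][1]*y + c[0][2]
       y' = c[1][0]*x + c[1][1]*y + c[1][2]
   and return the bounding box of the resulting quadrilateral.
   Assumption: corners sit at pixel coordinates x..x+w-1 and y..y+h-1. */
static BBox warped_bbox(const double c[2][3], int x, int y, int w, int h)
{
    assert(w > 0 && h > 0);
    const double xs[4] = { (double)x, (double)(x + w - 1),
                           (double)(x + w - 1), (double)x };
    const double ys[4] = { (double)y, (double)y,
                           (double)(y + h - 1), (double)(y + h - 1) };
    BBox b = { 0.0, 0.0, 0.0, 0.0 };
    for (int i = 0; i < 4; ++i) {
        double tx = c[0][0] * xs[i] + c[0][1] * ys[i] + c[0][2];
        double ty = c[1][0] * xs[i] + c[1][1] * ys[i] + c[1][2];
        /* The first corner initialises the box; later corners extend it. */
        if (i == 0 || tx < b.xmin) b.xmin = tx;
        if (i == 0 || tx > b.xmax) b.xmax = tx;
        if (i == 0 || ty < b.ymin) b.ymin = ty;
        if (i == 0 || ty > b.ymax) b.ymax = ty;
    }
    return b;
}

/* Returns 1 when the warped bounding box overlaps the destination rectangle
   (same corner convention as above), 0 otherwise. */
static int overlaps_roi(BBox b, int x, int y, int w, int h)
{
    return b.xmax >= x && b.xmin <= x + w - 1 &&
           b.ymax >= y && b.ymin <= y + h - 1;
}
```

Calling warped_bbox(coeff, 0, 0, 640, 480) with the coefficients above gives a box spanning roughly x in [-57, 684.24] and y in [-1, 554.64], and overlaps_roi reports an overlap with the 640x480 output rectangle. Under these assumptions the warped source does intersect rectOut, which is consistent with the expectation that the call should succeed.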

That sounds like a bug. We’ll investigate. If our investigation turns up a work-around, I’ll post it here.

I just tried the same code in CUDA 4.1 RC2 and got the same error.