nppiWarpPerspectiveBatch_8u_C3R results differ from cv2.warpAffine results

First, I used nppiWarpPerspectiveBatch_8u_C3R in C++ to apply an affine transform and got result image A. Then I used cv2.warpAffine in Python to apply the same transform and got result image B. But A and B are completely different.
The relevant C++ code is:
for (int i = 0; i < camera_num; ++i) {
    pBatchList_tmp[i].pSrc = image.mutable_gpu_data();
    pBatchList_tmp[i].nSrcStep = image.width_step();
    pBatchList_tmp[i].pDst = img_warp_perspective + i * image_mem_size;
    pBatchList_tmp[i].nDstStep = image.width_step();
    pBatchList_tmp[i].pCoeffs = coeffs_gpu;
}
nppiWarpPerspectiveBatchInit(pBatchList, 6);
nppiWarpPerspectiveBatch_8u_C3R(image_size, rc, rc, NPPI_INTER_LINEAR, pBatchList, 6);

The relevant Python code is:
data_affine_out_python = cv2.warpAffine(np.float32(img_cv), np.float32(homo_mat), (img_cv.shape[1], img_cv.shape[0]), borderMode=cv2.BORDER_TRANSPARENT)

coeffs_gpu and homo_mat hold the same values:
[[1.50023629,0.0217391304,-633.545369],
[0.00543478261,1.5,-330.456522],
[0,0,1]]
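
For reference, here is a minimal sketch of how this matrix acts on a single point. Because its last row is [0, 0, 1], the perspective divide is by 1, so the full 3x3 matrix and its top two rows (the 2x3 shape that cv2.warpAffine expects) map a point identically. The sample point (500, 400) and the helper names are hypothetical and chosen only for illustration. The sketch looks at the matrix alone and says nothing about how either library interprets it.

```python
import numpy as np

# Matrix from the post (3x3, last row [0, 0, 1]).
H = np.array([
    [1.50023629, 0.0217391304, -633.545369],
    [0.00543478261, 1.5, -330.456522],
    [0.0, 0.0, 1.0],
])


def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 matrix using homogeneous coordinates."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w


def apply_affine(M, x, y):
    """Map (x, y) through a 2x3 affine matrix."""
    u, v = M @ np.array([x, y, 1.0])
    return u, v


# The top two rows form the 2x3 affine matrix.
M = H[:2, :]

# Hypothetical sample point, not taken from the post.
p_persp = apply_homography(H, 500.0, 400.0)
p_affine = apply_affine(M, 500.0, 400.0)
print(p_persp, p_affine)
```

If the two libraries are given the same point mapping and still disagree, the difference would have to come from how each one interprets the matrix (for example mapping direction or coordinate origin) rather than from the matrix values.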

Besides, I found that the origin in OpenCV is at the top-left corner of the image, while the origin in NPP is at the center of the image. Will this difference affect the affine results?
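
To make the origin question concrete, here is a rough sketch of what a change of origin does to the matrix itself. If H is defined in top-left coordinates and T translates centered coordinates to top-left coordinates, the same transform expressed in centered coordinates is inv(T) @ H @ T. The 1920x1080 image size and the function name shift_origin are assumptions for illustration only, and whether NPP really places its origin at the center is my observation above, not something this sketch verifies.

```python
import numpy as np

# Matrix from the post, assumed here to be expressed in top-left coordinates.
H = np.array([
    [1.50023629, 0.0217391304, -633.545369],
    [0.00543478261, 1.5, -330.456522],
    [0.0, 0.0, 1.0],
])


def shift_origin(H, cx, cy):
    """Re-express H (top-left origin) in coordinates whose origin is (cx, cy)."""
    # T converts centered coordinates to top-left coordinates.
    T = np.array([
        [1.0, 0.0, cx],
        [0.0, 1.0, cy],
        [0.0, 0.0, 1.0],
    ])
    return np.linalg.inv(T) @ H @ T


# Hypothetical image size (the post does not state it).
width, height = 1920, 1080
cx, cy = width / 2.0, height / 2.0

H_centered = shift_origin(H, cx, cy)

# Consistency check on one point given in centered coordinates.
p_c = np.array([10.0, 20.0, 1.0])
p_tl = p_c + np.array([cx, cy, 0.0])             # same point, top-left coords
q_tl = H @ p_tl                                   # mapped in top-left coords
q_c = H_centered @ p_c                            # mapped in centered coords

print(H_centered)
```

In this sketch the scale and shear entries stay the same and only the translation column changes, so feeding an unchanged matrix to two libraries with different origins would shift the content of the output image.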

Thanks a lot!