Is it possible to execute multiple NPP functions on the GPU in parallel?
I’m using the function
NppStatus nppiNormDiff_L2_8u_C1R ( const Npp8u * pSrc1, int nSrcStep1, const Npp8u * pSrc2, int nSrcStep2, NppiSize oSizeROI, Npp64f * pRetVal )
on an image and an image patch. The problem is that the matrices are very small (16x16), so I can’t make much use of the GPU’s performance (in fact, it’s slower than on the CPU). The function gets called a lot, because I need to compare the norms of about 200x3200 patches. I would therefore like nppiNormDiff_L2_8u_C1R to be called multiple times in parallel. Is that possible?
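To make the per-patch work concrete, here is a minimal CPU sketch of the quantity involved: the L2 norm of the difference between two 8-bit, single-channel regions, i.e. the square root of the sum of squared pixel differences. The helper name normDiffL2_8u and its parameter layout are my own assumptions for illustration and are not part of NPP; the steps are row pitches in bytes, mirroring nSrcStep1 and nSrcStep2.

```cpp
#include <cmath>
#include <cstdint>

// CPU reference sketch: L2 norm of the difference of two 8-bit,
// single-channel regions of size width x height.
// normDiffL2_8u is a hypothetical helper name, not an NPP function.
// step1 / step2 are row pitches in bytes (like nSrcStep1 / nSrcStep2).
double normDiffL2_8u(const std::uint8_t* src1, int step1,
                     const std::uint8_t* src2, int step2,
                     int width, int height)
{
    double sumSq = 0.0;
    for (int y = 0; y < height; ++y) {
        // Advance by the row pitch, which may be larger than the ROI width.
        const std::uint8_t* row1 = src1 + y * step1;
        const std::uint8_t* row2 = src2 + y * step2;
        for (int x = 0; x < width; ++x) {
            const double d = static_cast<double>(row1[x]) - static_cast<double>(row2[x]);
            sumSq += d * d;
        }
    }
    return std::sqrt(sumSq);
}
```

For a 16x16 patch this loop touches only 256 pixel pairs per call, so the arithmetic per call is tiny compared with the fixed cost of issuing a separate GPU call for each patch.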