Hi dborisoglebskiy,
We see one inefficiency in your code, and we have one suggestion that would improve performance further (but it requires an algorithm tradeoff):
1. Get rid of the extra copies.
a. Instead of explicitly copying the input and output images in your remap function, use createVXImageFromCVMat to get vx_image objects that share the same memory as the input/output cv::Mat objects.
b. Then set these vx_image objects as the corresponding parameters of the nodes in the graph by using vxSetParameterByIndex (see the feature_tracker demo source code for a good example of this).
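
A rough sketch of 1a and 1b is below. It is not taken from your code or from the demo: the helper name runOnMats, the node handles, and the parameter indices are hypothetical placeholders for whatever your graph uses, and we are assuming createVXImageFromCVMat lives in the nvx_cv namespace under NVX/nvx_opencv_interop.hpp in your VisionWorks version. It also assumes dst is already allocated with the size and type the graph expects.

```cpp
#include <opencv2/core/core.hpp>
#include <VX/vx.h>
#include <NVX/nvx_opencv_interop.hpp>  // assumed location of createVXImageFromCVMat

// Hypothetical helper: points an existing graph at the caller's cv::Mat
// buffers and runs it once. inputNode/outputNode are the nodes that read
// the graph input and write the graph output; the indices are the
// positions of those images in each node's parameter list.
vx_status runOnMats(vx_context context, vx_graph graph,
                    vx_node inputNode, vx_uint32 inputIndex,
                    vx_node outputNode, vx_uint32 outputIndex,
                    const cv::Mat& src, cv::Mat& dst)
{
    // Wrap the cv::Mat buffers as vx_image objects that share their memory.
    vx_image vxSrc = nvx_cv::createVXImageFromCVMat(context, src);
    vx_image vxDst = nvx_cv::createVXImageFromCVMat(context, dst);

    // Rebind the graph's input and output to the wrapped images.
    vx_status status =
        vxSetParameterByIndex(inputNode, inputIndex, (vx_reference)vxSrc);
    if (status == VX_SUCCESS)
        status = vxSetParameterByIndex(outputNode, outputIndex, (vx_reference)vxDst);

    // Run the graph; the result lands directly in dst's buffer.
    if (status == VX_SUCCESS)
        status = vxProcessGraph(graph);

    vxReleaseImage(&vxSrc);
    vxReleaseImage(&vxDst);
    return status;
}
```

If several nodes read the input image (for example, one channel-extract node per channel), each of them would need its own vxSetParameterByIndex call. The cv::Mat objects have to stay alive for as long as the wrapping vx_image objects are in use.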
2. Unfortunately, with the current OpenVX API there is no way to avoid splitting the channels in order to do Remap. However, if you could get by with a perspective or affine warp instead, that would be much faster, and there would be no need to split the channels.
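
One way to judge whether an affine warp is close enough for suggestion 2 is to measure how far your remap table deviates from a candidate affine matrix. The sketch below is plain C++ with no OpenVX calls. The RemapTable and Affine types and the maxAffineError function are hypothetical names made up for illustration, and the table layout (one source coordinate per output pixel, row-major) is an assumption about how your map is stored.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical remap table: for each output pixel (x, y), the source
// coordinate to sample from, stored row-major.
struct RemapTable {
    int width;
    int height;
    std::vector<float> srcX;  // width * height entries
    std::vector<float> srcY;  // width * height entries
};

// Hypothetical 2x3 affine map from output pixel to source coordinate:
//   srcX = a * x + b * y + c
//   srcY = d * x + e * y + f
struct Affine {
    float a, b, c;
    float d, e, f;
};

// Largest distance, in pixels, between where the table and the affine
// map send any output pixel.
float maxAffineError(const RemapTable& table, const Affine& m)
{
    float worst = 0.0f;
    for (int y = 0; y < table.height; ++y) {
        for (int x = 0; x < table.width; ++x) {
            const std::size_t i =
                static_cast<std::size_t>(y) * table.width + x;
            const float ax = m.a * x + m.b * y + m.c;
            const float ay = m.d * x + m.e * y + m.f;
            worst = std::max(worst,
                             std::hypot(table.srcX[i] - ax, table.srcY[i] - ay));
        }
    }
    return worst;
}
```

If the worst-case error is small enough for your application, the table could be replaced by a single matrix. How much error is acceptable is the algorithm tradeoff mentioned above, and it depends on your use case.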
Thanks