NvBuffer Transform much slower than OpenCV GPU

The transform executes on the hardware converter. For optimal throughput, please run the converter at max clock:
Nvvideoconvert issue, nvvideoconvert in DS4 is better than Ds5? - #3 by DaneLLL

This should bring NvBufferTransform()/NvBufferComposite() to maximum throughput. Besides, if you have multiple threads calling NvBufferTransform(), please create an NvBufferSession.
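
For reference, here is a minimal sketch of how the session could be used. It assumes one NvBufferSession per calling thread, created once and reused for every frame that thread converts. The helper names convert_frames() and convert_in_parallel() are hypothetical, and the filter setting is only an example. The #else branch contains stand-in declarations, assumed to mirror nvbuf_utils.h, so that the sketch also compiles off-target; on a Jetson the real header is used and you link with -lnvbuf_utils.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <utility>
#include <vector>

#if __has_include("nvbuf_utils.h")
#include "nvbuf_utils.h"  // real header on Jetson
#else
// Off-target stand-ins only (assumption: they mirror nvbuf_utils.h).
// They do no real conversion; they exist so the sketch compiles anywhere.
struct _NvBufferSession {};
typedef struct _NvBufferSession *NvBufferSession;
enum NvBufferTransform_Filter { NvBufferTransform_Filter_Smart };
enum { NVBUFFER_TRANSFORM_FILTER = 1 << 2 };
struct NvBufferTransformParams {
    uint32_t transform_flag;
    NvBufferTransform_Filter transform_filter;
    NvBufferSession session;
};
static inline NvBufferSession NvBufferSessionCreate() { return new _NvBufferSession; }
static inline void NvBufferSessionDestroy(NvBufferSession s) { delete s; }
static inline int NvBufferTransform(int, int, NvBufferTransformParams *p) {
    return (p && p->session) ? 0 : -1;
}
#endif

using FdPair = std::pair<int, int>;  // (source dmabuf fd, destination dmabuf fd)

// Hypothetical helper: runs in one worker thread. The thread owns a single
// session and reuses it for all of its frames. Returns the number of
// transforms that reported success (return value 0).
int convert_frames(const std::vector<FdPair> &fd_pairs) {
    NvBufferSession session = NvBufferSessionCreate();
    if (!session) return 0;

    NvBufferTransformParams params;
    std::memset(&params, 0, sizeof(params));
    params.transform_flag = NVBUFFER_TRANSFORM_FILTER;         // example setting
    params.transform_filter = NvBufferTransform_Filter_Smart;  // example setting
    params.session = session;  // attach this thread's session

    int ok = 0;
    for (const FdPair &fds : fd_pairs) {
        if (NvBufferTransform(fds.first, fds.second, &params) == 0) ++ok;
    }

    NvBufferSessionDestroy(session);
    return ok;
}

// Hypothetical helper: starts one worker thread per work list and returns
// the total number of successful transforms across all threads.
int convert_in_parallel(const std::vector<std::vector<FdPair>> &per_thread) {
    std::vector<int> ok(per_thread.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < per_thread.size(); ++i) {
        workers.emplace_back([&ok, &per_thread, i] {
            ok[i] = convert_frames(per_thread[i]);
        });
    }
    for (std::thread &w : workers) w.join();

    int total = 0;
    for (int n : ok) total += n;
    return total;
}
```

In a real application the fd pairs would come from NvBuffer allocations or decoder/camera output, and each capture or processing thread would call convert_frames() (or keep its own session alive for the lifetime of the thread) instead of sharing the default session with other threads.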