Hi,
This is the capability of the hardware encoder, and you can achieve that performance by running a gstreamer command like the one in this post:
How can I customize to use non-blocking mode nvv4l2h264enc - #8 by DaneLLL
When using OpenCV there is an additional memory copy, and it may dominate the performance. Also, in the multi-thread case the CPU cores are not fully loaded, so it seems the multithreading in OpenCV is not efficient.
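
To show where the extra copy comes from, here is a minimal sketch of the usual way to feed OpenCV frames into the hardware encoder. It is an assumption about a typical setup, not the command from the linked post: the helper names `build_encode_pipeline` and `write_test_clip`, the output file `out.mkv`, the `matroskamux` container and the resolution/frame rate are all made up for illustration. The point is that frames leave OpenCV as BGR in CPU memory, so they pass through `videoconvert` and `nvvidconv` before `nvv4l2h264enc` sees them.

```python
def build_encode_pipeline(path: str) -> str:
    """Build a GStreamer pipeline string for cv2.VideoWriter (hypothetical helper)."""
    stages = [
        "appsrc",                                # frames pushed from OpenCV, in CPU memory
        "video/x-raw,format=BGR",                # OpenCV's native pixel layout
        "videoconvert",                          # CPU conversion BGR -> BGRx
        "video/x-raw,format=BGRx",
        "nvvidconv",                             # moves the frame into NVMM buffers
        "video/x-raw(memory:NVMM),format=NV12",  # format taken by the encoder
        "nvv4l2h264enc",                         # hardware H.264 encoder
        "h264parse",
        "matroskamux",                           # container chosen only for this sketch
        f"filesink location={path}",
    ]
    return " ! ".join(stages)


def write_test_clip(path="out.mkv", width=1920, height=1080, fps=30, frames=300):
    """Encode black frames through OpenCV; needs OpenCV built with GStreamer support."""
    # Imported here so the pipeline helper above can be used without OpenCV installed.
    import cv2
    import numpy as np

    writer = cv2.VideoWriter(
        build_encode_pipeline(path), cv2.CAP_GSTREAMER, 0, float(fps), (width, height)
    )
    if not writer.isOpened():
        raise RuntimeError("could not open the GStreamer pipeline from OpenCV")

    frame = np.zeros((height, width, 3), dtype=np.uint8)  # dummy BGR frame
    for _ in range(frames):
        writer.write(frame)  # each call hands a CPU buffer to appsrc
    writer.release()


if __name__ == "__main__":
    print(build_encode_pipeline("out.mkv"))
```

Timing `write_test_clip()` against a pure gstreamer pipeline at the same resolution should give a rough idea of how much the CPU-side conversion and copy cost on your board.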