NVEncode API nvEncLockBitstream() time consuming

Hi guys,

I am working with the Video SDK on a low-latency application,
and I found that nvEncLockBitstream() takes about 7 ms when encoding a 720p RGBA frame in Synchronous Encode Mode.
I think 7 ms is too much for a HW encoder, but I am not sure whether this is by design or just the result of a wrong setting.

Does anyone have a clue?

Just printing 1 line of text to console costs 1.4 ms.
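
If the 7 ms was measured with a print inside or right next to the timed region, the console output could account for part of it. Here is a minimal sketch of one way to keep printing out of the measurement, assuming plain C++11 and std::chrono::steady_clock. The helper names (time_call_ms, collect_samples_ms, report) are made up for illustration and are not part of the Video SDK.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Time a single call of `fn` in milliseconds using a monotonic clock.
// (Hypothetical helper, not an NVENC API.)
template <typename Fn>
double time_call_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Time `iterations` calls and buffer the samples in memory, so that no
// console I/O happens inside or between the timed regions.
template <typename Fn>
std::vector<double> collect_samples_ms(Fn&& fn, std::size_t iterations) {
    std::vector<double> samples;
    samples.reserve(iterations);
    for (std::size_t i = 0; i < iterations; ++i) {
        samples.push_back(time_call_ms(fn));
    }
    return samples;
}

// Print all samples only after the measurement loop has finished.
void report(const std::vector<double>& samples) {
    for (std::size_t i = 0; i < samples.size(); ++i) {
        std::printf("call %zu: %.3f ms\n", i, samples[i]);
    }
}
```

To use it, wrap the call you want to measure (for example your lock-bitstream call) in a lambda, pass it to collect_samples_ms(), and call report() once encoding is done. If the numbers are still around 7 ms with no printing nearby, then the time is really being spent in the call itself.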