NVENC on Quadro K4200 - high CPU usage with many threads


I am using NVENC on a Quadro K4200 to run multiple parallel encoding sessions. Oddly, with, for example, 16 or 25 threads running in parallel, CPU usage is quite high even though the encoding is supposed to be hardware accelerated.

I profiled the application to rule out mistakes in my own code and found that most of the time is spent in nvencodeapi64.dll calling NtGdiDdDDIRender in gdi32.dll. Is there something I may be doing wrong that causes this bottleneck? Any advice?

How are you calling the multiple sessions?