NVENC: higher latency with low-frequency frame input

I am making a real-time streaming app sample and I am seeing higher latency when I feed raw frames to the encoder at a low frequency.

I made sample code using NvEncoderD3D11 from the Video Codec SDK samples.

I created the encoder with nExtraOutputDelay=0 to remove the output delay
(with the default nExtraOutputDelay=3, the encoder returns the first encoded packet only after 3 raw frames have been submitted).
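
For reference, the only change from the usual SDK-sample construction is the fifth constructor argument; a minimal sketch, assuming the SDK sample's NvEncoderD3D11.h is on the include path and an ID3D11Device has already been created:

#include <d3d11.h>
#include "NvEncoderD3D11.h"

// With the default nExtraOutputDelay = 3, EncodeFrame() returns an empty
// packet list for the first three submitted frames; passing 0 makes every
// call return the packet for the frame that was just submitted.
NvEncoderD3D11* CreateZeroDelayEncoder(ID3D11Device* pDevice)
{
	return new NvEncoderD3D11(pDevice, 1920, 1080, NV_ENC_BUFFER_FORMAT_NV12, /*nExtraOutputDelay=*/0);
}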

When I supply raw frames to this encoder with no sleep, the encoding time is about 1.2 ms,
but when I supply raw frames at a 20 ms (50 fps) interval, the encoding time rises to about 4.3 ms.

The following is the test code:

Program.exe : supply frames with no sleep
Program.exe 1 : 20 ms (50 fps) interval using a Win32 event
Program.exe 2 : 20 ms (50 fps) interval using Sleep() (kernel32)
Program.exe 3 : 20 ms (50 fps) interval using std::condition_variable
Program.exe 0 20 : 20 ms (50 fps) interval using a spinlock

#include <windows.h>
#include <d3d11.h>
#include <dxgi.h>
#include <wrl/client.h>

#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <cstdlib>
#include <mutex>
#include <thread>
#include <vector>

#include "NvEncoderD3D11.h"

HANDLE event = CreateEvent(NULL, FALSE, FALSE, NULL);

bool state = false;
std::mutex mutex;
std::condition_variable signal;

int main(int argc, char* argv[]) {
	bool is_done = false;
	// Timer thread: every 20 ms, signal both the Win32 event and the condition variable.
	std::thread t([&is_done] {
		while (!is_done) {
			Sleep(20);
			SetEvent(event);
			
			mutex.lock();
			state = true;
			mutex.unlock();
			signal.notify_one();
		}
	});
	
	Microsoft::WRL::ComPtr<ID3D11Device> D3DDevice;
	{
		Microsoft::WRL::ComPtr<IDXGIFactory1> DXGIFactory;
		CreateDXGIFactory1(__uuidof(IDXGIFactory1), &DXGIFactory);
	
		Microsoft::WRL::ComPtr<IDXGIAdapter1> DXGIAdapter;
		// Pick the first NVIDIA adapter (vendor ID 0x10DE) and create a D3D11 device on it.
		for (UINT i = 0; DXGIFactory->EnumAdapters1(i, &DXGIAdapter) != DXGI_ERROR_NOT_FOUND; i++) {
			DXGI_ADAPTER_DESC1 desc = { 0, };
			DXGIAdapter->GetDesc1(&desc);
	
			if (desc.VendorId != 0x10DE)
				continue;
	
			D3D11CreateDevice(DXGIAdapter.Get(), D3D_DRIVER_TYPE_UNKNOWN, nullptr, NULL, nullptr, NULL, D3D11_SDK_VERSION, &D3DDevice, nullptr, nullptr);
	
			break;
		}
	}
	
	// nExtraOutputDelay = 0 (5th argument) so EncodeFrame() returns a packet for every input frame.
	NvEncoderD3D11* pEncoder = new NvEncoderD3D11(D3DDevice.Get(), 1920, 1080, NV_ENC_BUFFER_FORMAT_NV12, 0);
	
	NV_ENC_INITIALIZE_PARAMS encInitParams = { 0 };
	ZeroMemory(&encInitParams, sizeof(encInitParams));
	
	NV_ENC_CONFIG encConfig = { 0 };
	ZeroMemory(&encConfig, sizeof(encConfig));
	encInitParams.encodeConfig = &encConfig;
	encInitParams.encodeWidth = 1920;
	encInitParams.encodeHeight = 1080;
	encInitParams.maxEncodeWidth = 1920;
	encInitParams.maxEncodeHeight = 1080;
	
	pEncoder->CreateDefaultEncoderParams(&encInitParams, NV_ENC_CODEC_H264_GUID, NV_ENC_PRESET_LOW_LATENCY_HP_GUID);
	pEncoder->CreateEncoder(&encInitParams);
	
	
	uint64_t cnt = 0;
	uint64_t val = 0;
	
	while (true) {
	
		// Time a single GetNextInputFrame() + EncodeFrame() call.
		auto t1 = std::chrono::high_resolution_clock::now();
		{
			const NvEncInputFrame* pEncInput = pEncoder->GetNextInputFrame();
			ID3D11Texture2D* pEncBuf = (ID3D11Texture2D*)pEncInput->inputPtr;

			// Do nothing

			std::vector<std::vector<uint8_t>> vPacket;
			pEncoder->EncodeFrame(vPacket);
		}
		auto t2 = std::chrono::high_resolution_clock::now();
		cnt++;
		val += std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
		if (cnt % 100 == 0)
			printf("Time Elapsed : %6.2f (%6.2f)\n", val / 1000.0f / cnt, std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count() / 1000.0f);
	
	
	// Pace the next frame according to the selected mode (see the usage list above).
	if (argc > 1 && argv[1][0] == '1')
			WaitForSingleObject(event, INFINITE);
	
		if (argc > 1 && argv[1][0] == '2')
			Sleep(20);
	
		if (argc > 1 && argv[1][0] == '3') {
			std::unique_lock<std::mutex> lock(mutex);
	
			while (!state)
				signal.wait(lock);
	
			state = false;
		}
	
		auto t0 = std::chrono::high_resolution_clock::now();
		while (argc > 2)
			if (std::chrono::high_resolution_clock::now() - t0 > std::chrono::milliseconds(std::atoi(argv[2])))
				break;
	}
}

I found that when I run at least one instance of the test program with no interval,
the latency of the other test programs decreases significantly.

It seems that NVENC enters some kind of sleep state if it receives no input frame for more than roughly 2 ms.
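
Based on that observation, a possible (if wasteful) workaround is to keep the video engine busy with a second, throwaway encoder session whose output is simply discarded. A rough sketch follows; the helper name and the 720p size are arbitrary, the extra session costs GPU time and an NVENC session slot, and it would have to be launched on its own std::thread next to the real encoder, so this is more of an experiment than a fix:

#include <atomic>
#include <vector>
#include <d3d11.h>
#include "NvEncoderD3D11.h"

// Keep-alive session: continuously encodes (and throws away) small dummy
// frames so the video engine never sits idle long enough to downclock.
void RunKeepAliveEncoder(ID3D11Device* pDevice, std::atomic<bool>& stop)
{
	NvEncoderD3D11 enc(pDevice, 1280, 720, NV_ENC_BUFFER_FORMAT_NV12, 0);

	NV_ENC_INITIALIZE_PARAMS params = { 0 };
	NV_ENC_CONFIG config = { 0 };
	params.encodeConfig = &config;
	enc.CreateDefaultEncoderParams(&params, NV_ENC_CODEC_H264_GUID, NV_ENC_PRESET_LOW_LATENCY_HP_GUID);
	enc.CreateEncoder(&params);

	std::vector<std::vector<uint8_t>> vPacket;
	while (!stop) {
		enc.GetNextInputFrame();   // whatever is already in the input texture is fine
		enc.EncodeFrame(vPacket);  // output is discarded
	}
	enc.DestroyEncoder();
}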

Is there any solution for this issue…?

Dynamic clock scaling and GPU Boost apply to the video clock too, which would explain why the encode time rises once the encoder sits idle between frames.
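
If downclocking really is the cause, another thing worth trying is to pin the GPU clocks so they cannot ramp down between frames. A rough sketch using NVML (link against nvml.lib; it needs administrator rights and a GPU/driver that supports locked clocks, and whether a locked graphics clock also keeps the video clock boosted is an assumption to verify on your hardware):

#include <cstdio>
#include <nvml.h>

// Lock the GPU core clocks to their maximum supported value so the card
// cannot downclock while the encoder is idle between frames.
bool LockGpuClocks(unsigned int deviceIndex)
{
	if (nvmlInit() != NVML_SUCCESS)
		return false;

	nvmlDevice_t device;
	if (nvmlDeviceGetHandleByIndex(deviceIndex, &device) != NVML_SUCCESS)
		return false;

	unsigned int maxGraphicsMHz = 0;
	if (nvmlDeviceGetMaxClockInfo(device, NVML_CLOCK_GRAPHICS, &maxGraphicsMHz) != NVML_SUCCESS)
		return false;

	nvmlReturn_t r = nvmlDeviceSetGpuLockedClocks(device, maxGraphicsMHz, maxGraphicsMHz);
	if (r != NVML_SUCCESS)
		printf("nvmlDeviceSetGpuLockedClocks failed: %s\n", nvmlErrorString(r));
	return r == NVML_SUCCESS;
}

// Call nvmlDeviceResetGpuLockedClocks(device) and nvmlShutdown() on exit to
// restore the default behaviour; nvidia-smi -lgc / -rgc does the same from
// the command line.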