nvEncLockBitstream low performance

Hi everyone,

I’m trying to capture my desktop using Desktop Duplication API, encode the D3DTexture2D using NVENC and send it over the local network. The performance of everything is very high until I reach the part where we need to lock the bitstream and extract the data. Below is the code used:

NV_ENC_LOCK_BITSTREAM lockBitstreamData = { NV_ENC_LOCK_BITSTREAM_VER };
        lockBitstreamData.outputBitstream = vOutputBuffer[m_iGot % m_nEncoderBuffer];
        lockBitstreamData.doNotWait = false;
        auto starti = std::chrono::system_clock::now();
        NVENC_API_CALL(m_nvenc.nvEncLockBitstream(m_hEncoder, &lockBitstreamData));
        auto end = std::chrono::system_clock::now();
        std::chrono::duration<double> elapsed_seconds = end - starti;
        std::time_t end_time = std::chrono::system_clock::to_time_t(end);

        std::cout << "finished computation at " << std::ctime(&end_time)
            << "elapsed time: " << elapsed_seconds.count() << "s\n";
        uint8_t *pData = (uint8_t *)lockBitstreamData.bitstreamBufferPtr;

        if (vPacket.size() < i + 1)
        {
            vPacket.push_back(std::vector<uint8_t>());
        }
       
        vPacket[i].clear();
        vPacket[i].insert(vPacket[i].end(), &pData[0], &pData[lockBitstreamData.bitstreamSizeInBytes]);
        i++;

         NVENC_API_CALL(m_nvenc.nvEncUnlockBitstream(m_hEncoder, lockBitstreamData.outputBitstream));

The “NVENC_API_CALL(m_nvenc.nvEncLockBitstream(m_hEncoder, &lockBitstreamData));” takes anything from under 10ms when under desktop at low load to an average of 90ms when I run a game in full screen under heavy load. Our constraints require “real-time” 60fps so anything over 16ms is too high. Is there a way to get that down?

5 Likes

hi,have you solved this problem now, I ran into the same problem?