I’m trying to debug an issue with our application, which uses libavcodec with the nvenc codec; eventually I ended up with test code derived from the NvEncoderPerf NVENC sample (https://gist.github.com/tea/02d9df6e58a72f78096e7404e44967cc). It is mostly the original sample with the following changes:
- synchronous mode instead of async
- no file I/O (static image input, encoded bitstream is ignored)
- two instances of the encoder (the CNvEncoderPerf class), each running in its own thread
- the first instance runs a single encoding session indefinitely
- the second instance runs encoding sessions in a loop; each session runs for about 1 second
Every once in a while, during the second instance’s deinitialization (inside nvEncDestroyInputBuffer), the first instance crashes deep inside nvEncUnlockInputBuffer. Of course, the input buffer referenced by nvEncUnlockInputBuffer is not the one being destroyed, nor has it been destroyed earlier. The instances run in different threads and have separate NVENC contexts as well as separate CUDA contexts (though both sessions use the same CUDA device), so there shouldn’t be any shared state between the two instances, at least no user-visible state. Also, as I mentioned, it only happens occasionally, after several successful iterations, which hints at some race condition.
Is there some known limitation of NVENC with respect to multithreading (and thus a bug in my code)? Or is it a bug in NVENC/CUDA? I’m a bit puzzled by this issue - I’d expect many applications to use NVENC in a similar fashion (many threads starting, running and destroying separate encoding sessions).