Quadro vs GeForce for Parallel Encoding

We’re testing multiple parallel encodes on a single GPU device and seem to hit a limit with GeForce that doesn’t exist with Quadro. When initializing a third encoder on GeForce, the call to NvEncOpenEncodeSessionEx fails with NV_ENC_ERR_OUT_OF_MEMORY, regardless of video resolution.

So the question: is this a true limitation of the “consumer” grade GeForce, or a bug in our implementation that coincidentally doesn’t trigger on a Quadro device?

The Quadro board in question is an M6000 and the GeForce is a Titan Xp, with the latest driver (381.22) running on CentOS 7.3.

Yes, true limitation (in driver).
Only >=x2000 Quadros (and Grid and Tesla) support unlimited encoder sessions/streams (see https://developer.nvidia.com/video-encode-decode-gpu-support-matrix).

Okay, thanks for the info! Although this is annoying for unit tests running in parallel…
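For the parallel unit test case, one possible workaround is to gate encoder creation so that no more than two sessions are open at once across all test processes on the host. Below is a minimal sketch, assuming Linux and advisory file locks. The function name `try_acquire_slot`, the lock file names and the lock directory are all made up for illustration and are not part of the NVENC SDK. A test would hold the returned file object for as long as its encoder session is open and close it afterwards.

```python
import fcntl
import os


def try_acquire_slot(lock_dir, max_sessions=2):
    """Try to claim one of `max_sessions` host-wide slots.

    Returns an open file object holding the lock, or None if every slot
    is taken. Closing the file releases the slot. (Hypothetical helper,
    not an NVENC API.)
    """
    for i in range(max_sessions):
        path = os.path.join(lock_dir, "nvenc-slot-%d.lock" % i)
        f = open(path, "w")
        try:
            # Non-blocking exclusive lock: fails at once if the slot is held
            # by another open handle, in this or any other process.
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return f
        except BlockingIOError:
            f.close()
    return None
```

A test that gets `None` back can wait and retry, or be skipped, instead of failing in NvEncOpenEncodeSessionEx.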

Did some tests and it seems that the limitation is per host rather than per device. We have a rig with two Titan Xp cards and want to use the encoders on both, so I expected to be able to run 4 parallel encodes (2 per card), but we still hit the same restriction. This was tested with the official sample application NvEncoder like so:

The first two sessions run fine, one on each device:

> NvEncoder -deviceID 0 -i /dev/random -size 512 512 -endf 0 -o /dev/null
> NvEncoder -deviceID 1 -i /dev/random -size 512 512 -endf 0 -o /dev/null

The third one, the second for device 0, aborts with an error:

> NvEncoder -deviceID 0 -i /dev/random -size 512 512 -endf 0 -o /dev/null

The limit seems pretty much arbitrary if it’s per host, and a major problem for our requirements.

Yes, this behavior is officially defined in the “application notes”. Older SDKs (<=5) limit the whole system to 2 encoder sessions when there is at least one GeForce card in the system (license bug - see https://devtalk.nvidia.com/default/topic/800942/gpu-accelerated-libraries/session-count-limitation-for-nvenc-no-maxwell-gpus-with-2-nevenc-sessions-/post/4764210/#4764210).

Quote from the Video SDK 5.0 “application notes”:

The current SDK package allows up to two simultaneous encode sessions per system for low-end Quadro and GeForce cards. If the system contains any low-end hardware (even in conjunction with other high-end hardware), only two encoding sessions will be permitted.

Quote from the Video SDK 6/7/8 “application notes”:

The licensing policy is as follows:
As far as NVENC hardware encoding is concerned, NVIDIA GPUs are classified into two categories: “qualified” and “non-qualified”. On qualified GPUs, the number of concurrent encode sessions is limited by available system resources (encoder capacity, system memory, video memory etc.). On non-qualified GPUs, the number of concurrent encode sessions is limited to 2 per system. This limit of 2 concurrent sessions per system applies to the combined number of encoding sessions executed on all non-qualified cards present in the system.
For a complete list of qualified and non-qualified GPUs, refer to https://developer.nvidia.com/nvidia-video-codec-sdk.
For example, on a system with one Quadro K4000 card (which is a qualified GPU) and three GeForce cards (which are non-qualified GPUs), the application can run N simultaneous encode sessions on Quadro K4000 card (where N is defined by the encoder/memory/hardware limitations) and two sessions on all the three GeForce cards combined. Thus, the limit on the number of simultaneous encode sessions for such a system is N + 2.
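The N + 2 rule quoted above can be written out as a small calculation. This is only a sketch of the arithmetic in the SDK 6/7/8 policy text: the function name is made up, and the capacity N for a qualified card is a placeholder value, since the real N depends on encoder, memory and hardware limits that the notes do not quantify.

```python
def max_encode_sessions(gpus, qualified_capacity, non_qualified_limit=2):
    """Upper bound on concurrent encode sessions for one system.

    gpus: list of (name, is_qualified) tuples, one per card.
    qualified_capacity: dict mapping a qualified card's name to its N
        (placeholder values; the real N is resource dependent).
    non_qualified_limit: shared limit for all non-qualified cards combined.
    """
    # Each qualified card contributes its own resource-bound N.
    total = sum(qualified_capacity[name] for name, qualified in gpus if qualified)
    # All non-qualified cards together contribute at most 2, no matter how many.
    if any(not qualified for _, qualified in gpus):
        total += non_qualified_limit
    return total


if __name__ == "__main__":
    # One Quadro K4000 (N assumed to be 8 for illustration) plus three GeForce cards.
    mixed = [("Quadro K4000", True)] + [("GeForce", False)] * 3
    print(max_encode_sessions(mixed, {"Quadro K4000": 8}))  # N + 2

    # The rig from this thread: two Titan Xp cards, both non-qualified.
    print(max_encode_sessions([("Titan Xp", False)] * 2, {}))
```

For the two Titan Xp rig this gives 2 rather than 4, which matches the third NvEncoder instance failing.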