Nvidia A2 and A10 video encoding benchmarks and recommendations?

I am looking for some benchmarks/guidance on the A2 and A10 GPUs for H.264 encoding.

I need to spec a GPU (or a pair of GPUs, but no more than 2) that can handle inline encoding of fourteen 1080p video streams. I receive either 10-bit 1080i60 or 10-bit 1080p30 over SDI, then H.264 encode it.

From this T4 benchmark: https://developer.nvidia.com/blog/turing-h264-video-encoding-speed-and-quality/

Two T4s should be adequate and provide some headroom, but I would prefer an Ampere card to get the PCIe 4 bandwidth.
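As a rough sanity check, the load described above can be estimated with a few lines of arithmetic. This is a minimal sketch, not a measurement: the 4:2:2 chroma sampling and the "active picture only" simplification are assumptions that the post does not state, and all function names are made up for illustration.

```python
# Rough load estimate for fourteen 10-bit 1080-line SDI streams.
# ASSUMPTIONS (not from the post): 4:2:2 chroma sampling, active picture
# only (no blanking, audio, or ancillary data), no padding of 10-bit samples.

STREAMS = 14
WIDTH, HEIGHT = 1920, 1080
FRAMES_PER_SECOND = 30   # 1080p30; 1080i60 is 60 fields/s, i.e. 30 full frames/s
BITS_PER_SAMPLE = 10
SAMPLES_PER_PIXEL = 2    # assumed 4:2:2: on average one luma + one chroma sample per pixel


def aggregate_frame_rate(streams: int, fps: int) -> int:
    """Total number of 1080-line frames the encoder(s) must process per second."""
    return streams * fps


def raw_bitrate_gbps(streams: int, width: int, height: int, fps: int,
                     bits_per_sample: int, samples_per_pixel: int) -> float:
    """Uncompressed video payload moved to the GPU, in Gbit/s."""
    bits_per_frame = width * height * bits_per_sample * samples_per_pixel
    return streams * bits_per_frame * fps / 1e9


total_fps = aggregate_frame_rate(STREAMS, FRAMES_PER_SECOND)
total_gbps = raw_bitrate_gbps(STREAMS, WIDTH, HEIGHT, FRAMES_PER_SECOND,
                              BITS_PER_SAMPLE, SAMPLES_PER_PIXEL)

print(f"Aggregate encode load: {total_fps} frames/s of 1080p")
print(f"Raw upload bandwidth:  {total_gbps:.2f} Gbit/s ({total_gbps / 8:.2f} GB/s)")
```

Under these assumptions the load works out to 420 frames/s and about 17.4 Gbit/s (roughly 2.2 GB/s) of raw video. If the capture path pads 10-bit samples to 16 bits, the bandwidth figure grows accordingly.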

The A2 does not seem like a direct replacement for the T4, given its much lower specs, and I am skeptical that it has the same capacity as the T4.

The A10 seems to offer roughly double the performance of the T4. With no benchmarks available, I would expect it to achieve at least the same numbers as two T4s, based on the benchmark I linked above. I don’t know whether this is a valid conclusion.

Doing an evaluation with Nvidia is not an option. This application requires many PCI slots, and the line of servers I use has no Nvidia certification for any relevant GPU on a model with enough PCI slots, and very few certifications at all for anything smaller than an A30.

Does anyone have any comments on the performance of these two cards in this application?

Hi there!

Are you aware of our Video Codec SDK? As part of the documentation you will also find an overview of raw encode performance compared across different GPU generations. Look for “NVENC Performance” in the NVENC Application Note.

Another useful resource is the GPU NVENC/NVDEC support matrix, where you can see that the T4, A2, and A10 each have the same number of NVENC engines (1) and the same capabilities, just from different chip generations. So, other differences aside, you should still expect the same or even better performance with either the A2 or the A10.
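Since encode capacity scales with the number of NVENC engines rather than with CUDA cores, a sizing check can be sketched as below. This is only an illustration: `gpus_needed` is a hypothetical helper, the 20% headroom is an arbitrary assumption, and the per-engine throughput must be taken from the “NVENC Performance” table for your GPU generation, preset, and resolution. The value used in the example call is a placeholder, not a measured figure.

```python
import math


def gpus_needed(required_fps: float, fps_per_nvenc: float,
                nvenc_per_gpu: int = 1, headroom: float = 0.2) -> int:
    """Number of GPUs needed to sustain required_fps.

    fps_per_nvenc: 1080p frames/s one NVENC engine sustains at your preset
                   (look this up in the NVENC Application Note).
    nvenc_per_gpu: 1 for T4, A2, and A10 according to the support matrix.
    headroom:      fraction of capacity kept in reserve (assumed 20%).
    """
    usable_fps_per_gpu = fps_per_nvenc * nvenc_per_gpu * (1.0 - headroom)
    return math.ceil(required_fps / usable_fps_per_gpu)


required = 14 * 30  # fourteen streams at 30 full frames/s each

# PLACEHOLDER throughput for illustration only; substitute the documented value.
PLACEHOLDER_FPS_PER_NVENC = 300.0

print(gpus_needed(required, PLACEHOLDER_FPS_PER_NVENC))
```

Note that the result depends only on the per-engine throughput and the engine count, which is why comparing CUDA core counts between the cards says little about this workload.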

I hope that helps.

I was aware of the SDK, but not the documentation you provided - this is fantastic, exactly what I needed. I had searched around a bit but indexing isn’t fantastic.

The biggest gap in my understanding was that encoding is fully offloaded to NVENC; I had been busy chasing CUDA core count.
