8K 30fps 44Mbps HEVC performance bottleneck on GPU P4000

Hi experts,

I want to run 3 to 4 decoder sessions on a P4000 GPU.

My input is an 8K 30 fps 44 Mbps HEVC stream, but the decoder runs slowly: with only 2 sessions, the decode rate drops to 27 to 28 fps per session.

My system specification:

CPU: Intel® Xeon® Silver 4108 CPU @ 1.80GHz
GPU: NVIDIA P4000

Single-session performance:

  1. 54 fps (session 1)

Two-session performance:

  1. 28 fps (session 1) (reduced to 28 from 54 fps)
  2. 27 fps (session 2)

Three-session performance:

  1. 18 fps (session 1) (reduced to 18 from 28 fps)
  2. 17 fps (session 2) (reduced to 17 from 27 fps)
  3. 18 fps (session 3)

Can you suggest a GPU and CPU for 8K, 44 Mbps, 30 fps, YUV 4:2:0 8-bit content that can support at least 3 sessions without a drop in fps?


The P4000 can decode roughly a single 8K@30 stream. You cannot decode multiple 8K@30 streams in real time on Pascal/Volta or earlier GPUs.
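
As a rough check, the per-session figures in the question add up to about the same total in every case (54, 55, and 53 fps), which suggests the sessions are sharing one fixed decode budget. The Python sketch below only redoes that arithmetic. The shared-budget model and the helper names (`aggregate_fps`, `max_realtime_sessions`) are illustrative assumptions, not part of any NVIDIA SDK.

```python
# Per-session decode rates (fps) as reported in the question for the P4000.
measured = {
    1: [54],
    2: [28, 27],
    3: [18, 17, 18],
}

TARGET_FPS = 30  # real-time rate of each 8K stream


def aggregate_fps(rates):
    """Total decode throughput across all concurrent sessions."""
    return sum(rates)


def max_realtime_sessions(total_fps, target_fps=TARGET_FPS):
    """Whole number of streams that fit in the total at the target rate.

    Assumes (hypothetically) that throughput is a fixed budget shared
    between sessions, which the measured totals appear to support.
    """
    return int(total_fps // target_fps)


for sessions, rates in measured.items():
    total = aggregate_fps(rates)
    print(f"{sessions} session(s): total {total} fps, "
          f"real-time 8K@30 streams sustainable: {max_realtime_sessions(total)}")
```

Every case comes out at one sustainable real-time stream, so by this estimate 3 sessions at 30 fps would need roughly 90 fps of total decode throughput.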

However, the newly launched Turing GPUs have improved HEVC performance. TU104 and TU106 Tesla/Quadro cards have 2 and 3 NVDECs respectively, which also means additional throughput.

Tesla T4, Quadro RTX 4000, and Quadro RTX 5000 have multiple NVDECs, so you should easily be able to decode 3-4 8K@30 streams (or even more) on those GPUs.

If you have one of the above-mentioned GPUs, its shipping drivers will give you the required throughput.

Ryan Park