Right now we have a server with three (3) NVIDIA P100s, used mainly for video encoding/decoding with NVENC/NVDEC. We are using FFmpeg compiled against CUDA.
cat /usr/local/cuda/version.txt
CUDA Version 9.1.85
Everything goes fine if I start, let's say, 12 ffmpeg processes for live encoding from H.264 to HEVC, all of them on card 0: no problems at all.
But as soon as I start using either of the other two NVIDIA cards (1 or 2), the output fps of the first 12 processes (running on card 0) goes down.
As far as I understand this shouldn't happen, because each ffmpeg process is assigned to a single card; I'm using hwaccel cuvid and scale_npp from ffmpeg, roughly as in the commands below.
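To be concrete, each process looks roughly like this (a minimal sketch; the input files, scale size, and bitrate are placeholders, not our exact settings):

# Process pinned to GPU 0: decode with cuvid, scale with scale_npp, encode with nvenc
ffmpeg -hwaccel cuvid -hwaccel_device 0 -c:v h264_cuvid -i input0.ts \
  -vf scale_npp=1280:720 -c:v hevc_nvenc -gpu 0 -b:v 3M -f mpegts out0.ts

# Same pipeline pinned to GPU 1
ffmpeg -hwaccel cuvid -hwaccel_device 1 -c:v h264_cuvid -i input1.ts \
  -vf scale_npp=1280:720 -c:v hevc_nvenc -gpu 1 -b:v 3M -f mpegts out1.ts

In theory every stage (decode, scale, encode) stays on the one card selected by -hwaccel_device and -gpu, so processes on different cards shouldn't contend with each other.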
This is the topology (output of nvidia-smi topo -m):

        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      SYS     SYS     0-11,24-35
GPU1    SYS      X      NODE    12-23,36-47
GPU2    SYS     NODE     X      12-23,36-47

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing a single PCIe switch
  NV#  = Connection traversing a bonded set of # NVLinks
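Note that GPU 0 has CPU affinity 0-11,24-35 while GPUs 1 and 2 share 12-23,36-47, i.e. GPU 0 sits on a different NUMA node, and its links to the other two cards are SYS (crossing the inter-socket interconnect). Could that explain the slowdown? If it is a NUMA effect, would pinning each ffmpeg process to its GPU's local CPUs help? Something like this (a sketch built from the affinity columns above; the ffmpeg arguments are the same placeholders as before, and we haven't verified this ourselves):

# Pin processes using GPU 0 to that GPU's local cores
taskset -c 0-11,24-35 ffmpeg -hwaccel cuvid -hwaccel_device 0 -c:v h264_cuvid -i input0.ts \
  -vf scale_npp=1280:720 -c:v hevc_nvenc -gpu 0 -b:v 3M -f mpegts out0.ts

# Pin processes using GPU 1 (or 2) to the other node's cores
taskset -c 12-23,36-47 ffmpeg -hwaccel cuvid -hwaccel_device 1 -c:v h264_cuvid -i input1.ts \
  -vf scale_npp=1280:720 -c:v hevc_nvenc -gpu 1 -b:v 3M -f mpegts out1.ts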
All three cards report PCIe Gen3 x16:

nvidia-smi -q | grep Link -A2

    GPU Link Info
        PCIe Generation
            Max             : 3
        Link Width
            Max             : 16x
            Current         : 16x
    GPU Link Info
        PCIe Generation
            Max             : 3
        Link Width
            Max             : 16x
            Current         : 16x
    GPU Link Info
        PCIe Generation
            Max             : 3
        Link Width
            Max             : 16x
            Current         : 16x
Any help will be greatly appreciated.
Thanks in advance.