M4000 simultaneous h264 encode sessions problem

Hello People,

I'm having trouble with the limit on simultaneous encoding sessions on an M4000 card.

I can handle at most 16 x 720p streams on an M4000, but from what I've heard it should handle at least 30 x 720p streams.

I'm using the command below. Does anyone know about this, or can anyone help me out?

ffmpeg -c:v h264_cuvid -deint 2 -vsync 0 -loglevel warning -probesize 15000000 -analyzeduration 10000000  -i "udp://127.0.0.1:1001?fifo_size=1000000&overrun_nonfatal=1" -map 0:2 -map 0:3 -b:v 2000k -b:a 96k  -maxrate 2800k  -bufsize 3000k -c:v h264_nvenc -filter:v hwupload_cuda,scale_npp=w=1280:h=720:format=nv12:interp_algo=lanczos,hwdownload,format=nv12  -acodec aac   -f flv rtmp://127.0.0.1:1935/rtmp/test

If you have a performance problem, you should first find the bottleneck. Try “nvidia-smi dmon” (your pipeline uses memory, SM (CUDA), DEC and ENC, so watch all of them, and the pclk column is needed too) and “top” (for CPU). My guess is that the bottleneck is in DEC, not in ENC.
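To make the bottleneck check concrete, here is a minimal Python sketch that averages the utilization columns printed by “nvidia-smi dmon” and reports the busiest engine. It assumes the output starts with a header line beginning with “#” that names the columns (sm, mem, enc, dec); the exact column set differs between driver versions, and the helper names (parse_dmon, average_utilization, sample_gpu) are made up for this example.

```python
import subprocess

# Utilization columns of interest; which ones exist depends on the driver.
ENGINES = ("sm", "mem", "enc", "dec")


def parse_dmon(text):
    """Turn `nvidia-smi dmon` text into a list of {column: value} dicts."""
    columns = None
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first '#' line names the columns; later '#' lines
            # (units, repeated headers) are skipped.
            if columns is None:
                columns = line.lstrip("#").split()
            continue
        fields = line.split()
        if columns is None or len(fields) != len(columns):
            continue
        samples.append(dict(zip(columns, fields)))
    return samples


def average_utilization(samples):
    """Average each engine's utilization in percent, ignoring '-' entries."""
    averages = {}
    for name in ENGINES:
        values = [int(s[name]) for s in samples if s.get(name, "-").isdigit()]
        if values:
            averages[name] = sum(values) / len(values)
    return averages


def sample_gpu(count=10):
    """Collect `count` utilization samples (needs an NVIDIA driver installed)."""
    result = subprocess.run(
        ["nvidia-smi", "dmon", "-s", "u", "-c", str(count)],
        capture_output=True, text=True, check=True,
    )
    return average_utilization(parse_dmon(result.stdout))
```

Run sample_gpu() while the streams are active: if “dec” sits near 100% while “enc” still has headroom, the decoder is the limit, not the encoder.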

Looking at the newly released Video SDK 8.0 NVENC_Application_Note.pdf (Table 3. NVENC encoding performance for 1920p), Using_FFmpeg_with_NVIDIA_GPU_Hardware_Acceleration.pdf, https://developer.nvidia.com/video-encode-decode-gpu-support-matrix and the Wikipedia page “List of Nvidia graphics processing units”, this is my estimate for the ENCODER:

  • M4000 (GM204, Maxwell Gen 2), no “-preset” option given (please check “ffmpeg -h encoder=nvenc” for the defaults), so it probably defaults to “hq” + “single pass” (check Table 3 for FPS) => 267 FPS
  • 720p instead of 1080p = (1920*1080)/(1280*720) => *2.250
  • throttled to 800 MHz (per the wiki) instead of 1366 MHz (check Table 3 for MHz) = 800/1366 => *0.585
  • 2x encoder (per the matrix) => *2

267 * 2.250 * 0.585 * 2 = 702.878 FPS

  • 702.878 FPS / 60 FPS ? = 11.715 streams
  • 702.878 FPS / 30 FPS ? = 23.429 streams

Try “High Performance” + “Constant QP” instead => 396 FPS

396 * 2.250 * 0.585 * 2 = 1042.470 FPS

  • 1042.470 / 60 FPS = 17.374 streams
  • 1042.470 / 30 FPS = 34.749 streams
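The two estimates above can be reproduced with a short Python sketch. The function names are made up for this example, the default values are the ones quoted in this post (800 MHz vs. 1366 MHz, two encoders, 1080p reference), and it inherits the same assumption that throughput scales linearly with pixel count and clock. It uses the exact 800/1366 ratio rather than the rounded 0.585, so the totals come out slightly higher than the hand calculation.

```python
import math


def estimate_total_fps(table_fps, clock_mhz=800.0, ref_clock_mhz=1366.0,
                       encoders=2, ref_pixels=1920 * 1080,
                       target_pixels=1280 * 720):
    """Scale a reference FPS figure to the target resolution, clock and
    encoder count, assuming everything scales linearly (a rough model)."""
    resolution_factor = ref_pixels / target_pixels   # 2.25 for 1080p -> 720p
    clock_factor = clock_mhz / ref_clock_mhz         # about 0.586
    return table_fps * resolution_factor * clock_factor * encoders


def estimate_streams(table_fps, stream_fps, **kwargs):
    """Whole number of streams that fit at `stream_fps` frames per second."""
    return math.floor(estimate_total_fps(table_fps, **kwargs) / stream_fps)


# The two cases from this post, each at 60 and 30 FPS per stream.
for label, table_fps in (("hq, single pass", 267), ("hp, constant QP", 396)):
    total = estimate_total_fps(table_fps)
    print(f"{label}: {total:.1f} FPS total, "
          f"{estimate_streams(table_fps, 60)} streams at 60 FPS, "
          f"{estimate_streams(table_fps, 30)} streams at 30 FPS")
```

Treat the result as an upper bound for the encoder alone: it says nothing about the decoder, the scaling filter or the CPU.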

PS: citation from NVENC_Application_Note.pdf: “Encoder performance depends on many factors, including but not limited to: Encoder settings, GPU clocks, GPU type, video content type etc. Performance reported in SDK 7.1 and earlier SDK versions was measured using content which typically yields higher fps. Starting SDK 8.0, we are reporting average performance between best & worst case content.”