Speeding up decoding based on the NvDecodeGL sample


I am trying to reach a decoding speed of ~540 fps for HD H.264 video on an NVIDIA GPU. The ~540 fps figure for HD comes from page 10 of this presentation:

I tried many settings, but the best I can get is 310 fps. I am using a GeForce GTX 970 graphics card and decoding an HD 30 fps video. Here is the configuration that gives me 310 fps:

  1. Disable frame rendering while decoding. I did this by setting g_bUseDisplay = false in the renderVideoFrame() method in NvDecodeGL.cpp, near line 1540. Without this change the decoding speed is below 100 fps.

  2. Generate the test H.264 video with ffmpeg using these settings:

    • Resolution set to 1920x1080
    • No B-frames
    • Main profile
    • The fastdecode tune option

     This is the ffmpeg command I used to transcode a 4K input video into the HD test video:
ffmpeg -i jellyfish-120-mbps-4k-uhd-h264.mkv -c:v libx264 -profile:v main -vf scale=1920:1080 -b:v 50000k -an -tune fastdecode -coder 0 -flags -loop -g 30 -bf 0 -t 10 jellyfish-hd-50mbps_main_nosound_fast_short.mp4
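
The frame rates quoted in this post are averages over the whole clip. As a point of comparison, here is a minimal C++ sketch of that kind of measurement. It is not taken from the NvDecodeGL sample: computeFps, measureDecodeFps, and the decodeOneFrame callable are hypothetical names used only for illustration, and the callable stands in for whatever per-frame decode call the real code makes.

```cpp
#include <chrono>
#include <cstddef>

// Frames per second from a frame count and an elapsed time in seconds.
// Returns 0 when no time has elapsed, to avoid dividing by zero.
inline double computeFps(std::size_t frames, double seconds) {
    return seconds > 0.0 ? static_cast<double>(frames) / seconds : 0.0;
}

// Times `frameCount` calls of `decodeOneFrame` and returns the average fps.
// `decodeOneFrame` is a hypothetical stand-in for the real decode call; it
// is assumed to return only once the frame has actually been decoded.
template <typename DecodeFn>
double measureDecodeFps(std::size_t frameCount, DecodeFn decodeOneFrame) {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (std::size_t i = 0; i < frameCount; ++i) {
        decodeOneFrame();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return computeFps(frameCount, elapsed.count());
}
```

With the 10-second, 30 fps clip produced by the command above, frameCount would be about 300. If the real decode call returns before the GPU has finished the frame, the timed region would also need to include whatever wait the decoder performs for the last frame, otherwise the result overstates the speed.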

Is there a particular sample video I should use, or a particular setting I need in the decoder code, to reach the 540 fps decoding speed?

Thank you very much!

Sorry, I don’t have an answer, but I am wondering whether those 540 fps in the GM204 column could be a cut & paste error from the GM206 column, or whether they might only be achievable with the most powerful GM204 GPU, which would be the GTX 980 (whereas you use the GTX 970)?

Hi njuffa, thanks for your reply! I tried the same experiment on a GTX 1070 card. It is even slower (~200 fps). I will try again when I get a GTX 980.