How can I fix low fps when loading multiple videos (6 videos, 2048x2048, 192 MB per video)?

I tried loading 6 videos (2048x2048, 192 MB, 11 seconds per video) the same way the CUDA sample cudaDecodeGL does.
Every CUDA parameter I set was the same as in the sample. In the end I got only 30 fps, measured with Fraps, while playing the 6 videos in sync.
That is too low… Should I change some parameters or flags from the sample?
I really don't understand how the decoder and parser work, because I just set the video file path on the video source object and then use the parser and decoder to decode the video… When I load 6 videos, there are 6 videoSource objects and 6 videoDecoder, videoParser and videoFrameQueue objects. Do they run in parallel?


Can you let us know the pure decoding speed? Please switch off the display in the application while taking the measurement.
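
One way to take that measurement is to time the decode loop on its own, with rendering disabled. The following is only a minimal sketch: `DecodeNextFrame`, `DecodeStats` and `measureDecodeOnly` are hypothetical names, not part of the CUDA sample or the SDK, and the callback stands in for whatever call pulls one decoded frame out of your decoder.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for one decode step of the application:
// returns true if a frame was decoded, false once the stream has ended.
using DecodeNextFrame = std::function<bool()>;

struct DecodeStats {
    std::size_t frames;   // number of frames decoded
    double seconds;       // wall-clock time spent in the loop
    double fps() const { return seconds > 0.0 ? frames / seconds : 0.0; }
};

// Runs the decode loop without any rendering and reports the throughput.
DecodeStats measureDecodeOnly(const DecodeNextFrame& decodeNextFrame) {
    using Clock = std::chrono::steady_clock;
    std::size_t frames = 0;
    const auto start = Clock::now();
    while (decodeNextFrame()) {
        ++frames;
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return {frames, elapsed.count()};
}
```

Comparing the resulting `fps()` with the 30 fps seen in Fraps should show whether decoding or display is the limiting step.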

You can also refer to …\Samples\AppDecode\AppDecMultiInput in SDK 8.1 for programming multiple-instance decoding. When running multiple decoding instances, all the individual instances should ideally run in parallel.
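
On the host side, running the instances in parallel can be sketched as one thread per stream. This is an assumption about how an application might be structured, not the code of the SDK sample: `DecodeJob` and `decodeAllInParallel` are hypothetical names, and each job is assumed to own its own source, parser, decoder and frame queue so that the threads share no state.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-stream job: owns its own source, parser and decoder,
// decodes the whole stream and returns the number of frames it produced.
using DecodeJob = std::function<std::size_t()>;

// Runs every job on its own host thread and returns the per-stream frame counts.
std::vector<std::size_t> decodeAllInParallel(const std::vector<DecodeJob>& jobs) {
    std::vector<std::size_t> frameCounts(jobs.size(), 0);
    std::vector<std::thread> workers;
    workers.reserve(jobs.size());
    for (std::size_t i = 0; i < jobs.size(); ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        workers.emplace_back([&jobs, &frameCounts, i] { frameCounts[i] = jobs[i](); });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return frameCounts;
}
```

If all 6 streams are instead decoded one after another on a single thread, the instances cannot overlap on the host side, whatever the hardware is capable of.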

Ryan Park

OK… Thanks for your reply.

I have a new question:
I ran the sample AppDecGL and want to use Nsight to analyze the program. But when I select OpenGL in Trace Settings, Nsight does not activate, and the sample program itself breaks.

How can I fix it?

I solved the problem by updating the glew and glut libs, but only for the win32 build…
The x64 version still cannot activate Nsight…

Also, does CUDA support a win32 build of the AppDecGL sample?