DirectShow Filter / DMO using CUDA

I need to parallelize an existing Video for Windows codec using CUDA. I’m planning to implement it as either a DirectShow filter or a DMO. A DMO seems like it would be easier to implement.

i) Would using a DirectShow filter rather than a DMO give any significant performance benefit or loss as far as CUDA is concerned? My guess is it won’t matter.

ii) Is there a sample that I can refer to for implementing a DirectShow filter or DMO using CUDA? I know there is one for OpenGL filters. If not, what’s the closest sample that could help me?

iii) Are any stats available on performance improvements for codecs using CUDA?


The only example I’ve seen is the CUDA accelerated version of Dirac that wumpus posted in the forum some weeks ago:

He mentioned somewhere in the forum what the speedup looked like, but I can’t recall. You might need to check his posting history. (Or hope he sees this thread. :) )


Yes, but that uses the GStreamer framework, whereas I was looking for a DirectShow-based filter / DMO. I guess no one has tried (at least openly) to make a DirectShow DMO/codec based on CUDA. Thanks anyway.