DirectShow and CUDA


I’m using CUDA in a static library. This library is used either by an executable or by a DirectShow filter.

With the executable everything goes fine.
With the filter, I can set the device, see its properties, allocate memory and transfer data to it, but it fails when it comes to cudaBindTextureToArray with “invalid texture reference”, cudaMemcpyToSymbol with “invalid device symbol”, and kernel launch with “invalid device function”.

The texture reference and the constant data (for which I call cudaMemcpyToSymbol) are global variables.

Any suggestions, hints, ideas?
Thanks, Bogdan

I’m using VC 7.1 with .NET Framework 1.1.

The only time I received those errors was when trying to run code compiled with arch_11 on a device that does not support it.

I have a GTX 260, and I’m not using doubles at all.

I’m having exactly the same problem!

I’m running with Visual Studio 2008, Windows XP SP3, CUDA 2.2

Sometimes I got “invalid argument”, other times “invalid device function”. The CUDA code runs just fine in executables; things only go wrong when it is embedded in a DirectShow filter.

Any information on this would be much appreciated!

Could you try the MD version of CUDA 2.2? (It can be downloaded from a thread on this forum.) It works with VS2008. I’m just curious whether it changes anything.

Hi bog,

I’m going to try it out


I tried it, and it made no difference in my case.

I thought that could be a cause, because my library is compiled with /MD.

It seems that CUDA and DirectShow are not very happy together!?

I also checked that all CUDA calls were made from the same thread.

Hi bog,

I had the chance to try out my program today, and it is working… CUDA is running just fine.

Some differences in the setup since the last time I got the “invalid device function” error:

  1. I was using Visual Studio 2008 Express Edition, Windows SDK 6.1, CUDA 2.2, Windows XP Professional.

    Now I have switched to Visual Studio 2008 Professional, Windows SDK 6.1, CUDA 2.2, Windows XP Professional x64 Edition (some of the libraries I am using require an x64 setup).
  2. Previously, my DirectShow filter had an internal capture graph consisting of CaptureSource(Webcam)->SampleGrabber->NullRenderer.

    Now I have got rid of the internal graph and read a still image from the hard disk instead of capturing and streaming from a webcam.

I’m using /MT for CUDA compilation and /MD for C++ compilation, just as before; I haven’t even switched to /MD for CUDA yet and it is already working.

So either 1 or 2 fixed the problem. I don’t have the right hardware at hand to tell which, so I will try it out later and update you.



Hi Yilei,

I’m using VS2005 Professional Edition on 32-bit XP. The graph includes a video decoder followed by a video encoder. (Maybe if I read an already decoded source from disk, it will work.)

Hi bog,

I’ve been playing around on two computers, one with Windows XP x64 / VS 2008 Pro, the other with 32-bit Windows XP / VS 2008 Express. They are in different places with no proper remote access, so it took me a while to find things out…

Last time I suggested that getting rid of the internal graph, or reading frames from the local hard disk rather than streaming from the webcam, might be why the “invalid device function” error went away. It turns out neither is the reason.

The sole difference between a working project and the one giving “invalid device function” error is this, at least for me:

Project -> Properties -> Linker -> Advanced -> Entry Point. In the working project I leave it blank, and in the failing one I have DllEntryPoint.

I guess ultimately it is calling CUDA code within a DLL that is making things ugly.

I’d really appreciate it if someone had an explanation for this. I was lucky to find this glitch by trial and error…


The documentation for the linker’s /ENTRY option states: “It is recommended that you let the linker set the entry point so that the C run-time library is initialized correctly, and C++ constructors for static objects are executed.”

Manually setting Entry Point to DllEntryPoint triggers the “invalid device function” error; maybe CUDA initialization is not fully completed after all, even though memcpy works?
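
A possible explanation, offered as an assumption and not confirmed anywhere in this thread: as far as I understand, the host code generated by nvcc registers kernels, texture references and constant symbols with the CUDA runtime from static initializers. If the entry point is overridden so that the C run-time start-up is skipped, those initializers never run, and the runtime has nothing to look up, while calls that need no registration (device selection, allocation, plain memcpy) still work. The sketch below is plain C++ that only models this idea; it does not use the CUDA API, and every name in it (registry, Registrar, launch, scale_by_two) is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the runtime's table of registered kernels.
using KernelFn = int (*)(int);

std::map<std::string, KernelFn>& registry() {
    // Function-local static: constructed on first use, so it is safe to
    // call from other static initializers.
    static std::map<std::string, KernelFn> table;
    return table;
}

// Registers a function from its constructor, the way a static initializer would.
struct Registrar {
    Registrar(const char* name, KernelFn fn) { registry()[name] = fn; }
};

int scale_by_two(int x) { return 2 * x; }

// This constructor runs during C run-time start-up, before main() or DllMain().
// If the start-up code is bypassed, this line never executes and the
// registry stays empty.
static Registrar register_scale("scale_by_two", scale_by_two);

// Looks the name up and calls it. Returns "ok", or an error string modeled
// on the one reported in this thread when the name was never registered.
std::string launch(const std::string& name, int arg, int* out) {
    assert(out != nullptr);
    auto it = registry().find(name);
    if (it == registry().end()) return "invalid device function";
    *out = it->second(arg);
    return "ok";
}
```

In a normally started program, launch("scale_by_two", 21, &out) returns "ok" and sets out to 42, because the static Registrar has already run. A name that was never registered returns "invalid device function", which is the state every kernel would be in if the static initializers were skipped. If this is indeed the cause, leaving Entry Point blank and letting the linker pick the default entry, as in the working project, is the fix.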