Simple Example to create DLL with NVCC

Hi all,
I need some examples of how to create a DLL with CUDA.
Can somebody help me? A very simple DLL, with sources and the directives for nvcc.
Thanks :lol:

Is there no CUDA architect who can help me? :(
Is it possible that NVIDIA developers won't help to enlarge the number of CUDA users?

Well, you asked for sources and everything…

Information transfer usually happens by “pass by reference”. People don't “pass by value”.

So the only reference I am going to give you is this:
Create a DLL project and use CUDA inside it. Refer to MSDN if you don't know how to create a DLL.

OK, I tried to create a DLL, but Visual C++ 2008 gives me an error.
I attach the zip file so that the problem can be resolved (it is the same problem that many developers have…).
It is a DLL project that uses a trivial custom CUDA function. “summdll.cpp” exports the “Somma2” function, which calls chiamaSomma2, which in turn uses the “inc_array2” CUDA function. For now it isn't important that I don't allocate memory for the variables; what matters is building this DLL so that Somma2 can be called.
Can you resolve it? :o
Thanks! :)
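
For readers following along, the layering described in this post might look roughly like the sketch below. The names Somma2, chiamaSomma2 and inc_array2 come from the post; their signatures, the SUMMDLL_API macro and the plain CPU loop standing in for the real CUDA kernel are all assumptions, chosen so the sketch builds with any C++ compiler. In the actual project, inc_array2 would be a __global__ kernel in a .cu file compiled by nvcc, and chiamaSomma2 would handle the device allocation, the copies and the kernel launch.

```cpp
#include <cstddef>

// Export macro: __declspec(dllexport) on Windows, nothing elsewhere.
// SUMMDLL_API is a hypothetical name, not taken from the attached project.
#if defined(_WIN32)
#  define SUMMDLL_API extern "C" __declspec(dllexport)
#else
#  define SUMMDLL_API extern "C"
#endif

// Stand-in for the CUDA kernel "inc_array2" (assumed signature). In the
// real DLL this would be a __global__ function; here it is a CPU loop.
static void inc_array2(float* data, std::size_t n, float value) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] += value;
    }
}

// Host-side wrapper (assumed signature). In the real DLL this would copy
// the data to the device, launch the kernel and copy the result back.
static void chiamaSomma2(float* data, std::size_t n, float value) {
    if (!data || n == 0) {
        return;  // nothing to do
    }
    inc_array2(data, n, value);
}

// The only symbol the DLL exports. extern "C" keeps the exported name
// free of C++ name mangling.
SUMMDLL_API void Somma2(float* data, int n, float value) {
    if (n > 0) {
        chiamaSomma2(data, static_cast<std::size_t>(n), value);
    }
}
```

Keeping the exported function in the .cpp file and the CUDA code behind a plain host wrapper means that only the wrapper and the kernel need to go through nvcc; the rest of the DLL is ordinary C++.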

I don't have VS 2008. I only have VS 2005.

Note that VS 2008 is supported only by CUDA 2.2, I think.

I see you have copied all the CUDA include files into your folder. This is NOT necessary at all. You could just set the include path to $(CUDA_INC_PATH). Check what echo %CUDA_INC_PATH% prints at the command prompt; it must be defined if you have installed CUDA.

Have you set a custom build rule for the CU file so that it is compiled with “nvcc”? In that dialog you also need to specify the output OBJ file so that the linker can pick it up and link it.

OK, maybe I have resolved it with VS 2008. Can you post your complete solution as an attachment?
That way I can compare the solutions and be sure (I'm trying emulation mode… now).
Thanks very much :)

Before I give it to you: have you ever compiled a CUDA application successfully (not a DLL)?

Yes, in emulation mode. Shortly I will be using a GeForce GTX 260 (CUDA 2.2).

So you know how to set build rules, etc.? Is that right?

Try compiling an application for device mode before moving on to a DLL.

That, no! VS 2008 builds the DLL. It is strange: I tried to build the DLL in release mode, it built (of course), and strangely I can use the exported symbol, even though I do not have a GeForce GTX 260.

Maybe your CU file was compiled with -deviceemu.

Hmm… my English is very bad… I don't think it was built in emulation mode, because when I build it VS gives me an error if I insert the Sleep function inside the __global__ function… :o

OK, maybe I will send you some video documentation by the weekend.

You're a great friend! :) :D
I'll be waiting! :)
Thanks! :)


Please look at the attachments: 2 WMV files.

One covers creating a DLL with CUDA.
The other covers creating an app that uses the DLL above.

Hope that helps.

If you have Windows Media Player (9 or above) on your PC, I think the WMVs will play without codec problems.
Let me know if you run into codec issues.
You could install the K-Lite Mega Codec Pack (it was listed on TechRepublic and worked well for me).

Hi Sarnath,
I must thank you for your help and your little big work! :) Thank you very, very much! :)
As you can see, many people have downloaded your tutorial! :) That must be satisfying, no? ;)
If I can help you in some way (especially with Delphi), tell me :)
Tomorrow I will try to build a DLL your way!
Many thanks again,
Guido :)

I am downloading it too. I never needed to create DLLs but it may be handy at some point :)
Many thanks

Guido, PDan,

Thanks! It was only a small piece of work. Glad if it was of any help!

Best Regards,

Thanks for posting this video, Sarnath! I've been keeping this topic in the back of my mind since I saw it created. I have just followed the videos' directions and got my project working as a DLL.

However, I notice a pretty large difference in execution time (originally 300 ms, now 600 ms) when calling my functions through the DLL as opposed to calling them directly in the console application. Has anyone else seen this before? What do most people do to package their code?



I think that latency applies only to the first call into the DLL. Subsequent calls should be faster.

The first call into a DLL usually involves a lot of patching (relocation), and that probably explains the huge time taken…

Can you confirm this?

Also, I am hoping your app and DLL are both compiled in “Release” mode.
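
One way to check this is to time several consecutive calls and compare the first one with the rest. Below is a minimal sketch, assuming the DLL entry point can be wrapped in a callable. The helper names time_calls and warm_average are hypothetical, and std::chrono needs a C++11 compiler, which is newer than the VS 2005/2008 toolchains discussed in this thread.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Calls `fn` `repeats` times and returns the wall-clock duration of each
// call in milliseconds, in call order. Element 0 is the first (cold) call.
std::vector<double> time_calls(const std::function<void()>& fn, int repeats) {
    std::vector<double> ms;
    if (repeats <= 0) {
        return ms;
    }
    ms.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// Average of every call except the first; 0 if there are fewer than two.
double warm_average(const std::vector<double>& ms) {
    if (ms.size() < 2) {
        return 0.0;
    }
    double sum = 0.0;
    for (std::size_t i = 1; i < ms.size(); ++i) {
        sum += ms[i];
    }
    return sum / static_cast<double>(ms.size() - 1);
}
```

If the first entry is much larger than warm_average of the same run, the extra time is probably a one-time cost; if every call is slow, the overhead lies elsewhere. Since kernel launches are asynchronous, the timed callable should wait for the GPU to finish (for example with cudaThreadSynchronize in CUDA 2.x) before it returns.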