Here is a minimal example using three source files: cuda_dll.cu contains the DLL source code, cuda_dll.h declares the simple C-style API that the DLL exports, and cuda_main.cu calls the DLL. This example uses the CUDA runtime.
cuda_dll.h:
#ifdef CUDADLL_EXPORTS
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT __declspec(dllimport)
#endif
extern "C" DLLEXPORT void wrapper (int n);
cuda_dll.cu:
#include <stdio.h>
#include <stdlib.h>
#include "cuda_dll.h"
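__global__ void kernel (int n)
{
    printf ("kernel: n = %d\n", n);
}
/* gets C linkage and DLL export from the declaration in cuda_dll.h */
void wrapper (int n)
{
    printf ("wrapper: calling kernel()\n");
    kernel<<<1,1>>>(n);
    /* wait for the kernel to finish; this also flushes its printf output */
    cudaDeviceSynchronize();
}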
cuda_main.cu:
#include <stdio.h>
#include <stdlib.h>
#include "cuda_dll.h"
int main (void)
{
    printf ("main: calling wrapper()\n");
    wrapper (5);
    fflush (stdout);
    return EXIT_SUCCESS;
}
Build the DLL, which creates two files (as is the custom on Windows): the actual shared library cuda_dll.dll and the interface stub (import) library cuda_dll.lib:
nvcc --shared -DCUDADLL_EXPORTS -o cuda_dll.dll cuda_dll.cu
Build the main app and link with the stub library:
nvcc -o cuda_main.exe cuda_main.cu cuda_dll.lib
Run the app (cuda_main.exe), which prints:
main: calling wrapper()
wrapper: calling kernel()
kernel: n = 5
Don’t ask me how to do this with the MSVS IDE. I have never used the IDE, so I don’t know.
You do not have to build the calling app from CUDA source code compiled with nvcc. You can also call the DLL from regular C++ code compiled with MSVC. Simply copy cuda_main.cu to cpp_main.cpp, then build the main app like so:
cl /Fecpp_main.exe cpp_main.cpp cuda_dll.lib
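If you would rather not link against the stub library at all, a plain C++ caller can also load the DLL explicitly at run time. The following is only a sketch of that variant, not something from the build above: the helper name call_wrapper_dynamically is made up for illustration, and the code assumes the DLL exports wrapper exactly as declared in cuda_dll.h. It uses the Win32 calls LoadLibraryA, GetProcAddress, and FreeLibrary; on other platforms the helper simply reports failure.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Signature of the function declared in cuda_dll.h: void wrapper (int n)
typedef void (*wrapper_fn)(int);

// Hypothetical helper (not part of the original example): loads the DLL at
// run time and calls wrapper(n). Returns true only if both the library and
// the exported symbol were found.
bool call_wrapper_dynamically(const char *dll_path, int n)
{
#ifdef _WIN32
    HMODULE lib = LoadLibraryA(dll_path);
    if (lib == NULL) {
        std::fprintf(stderr, "could not load %s\n", dll_path);
        return false;
    }
    // extern "C" in cuda_dll.h keeps the exported name unmangled,
    // so the symbol can be looked up by its plain name.
    wrapper_fn fn = reinterpret_cast<wrapper_fn>(GetProcAddress(lib, "wrapper"));
    bool ok = (fn != NULL);
    if (ok) {
        fn(n);
    }
    FreeLibrary(lib);
    return ok;
#else
    (void)dll_path;
    (void)n;
    return false;  // DLLs exist only on Windows
#endif
}
```

A main() that calls call_wrapper_dynamically("cuda_dll.dll", 5) can then be built with cl without listing cuda_dll.lib on the command line; the DLL only needs to be findable when the app runs.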