How to generate a DLL from CUDA example code

I have successfully modified simpleCUFFT.cu from the CUDA example codes to perform my computation. I can use Visual Studio 2017 (MS2017) to build an exe file and run it from CMD. Eventually, though, I want to call the functions from LabVIEW, and for that I need a DLL file. So the question is: how can I generate a DLL file in MS2017? Following the discussion in the Stack Overflow thread "creating dll from cuda using nvcc", I played with the configuration type, but the build still gives me an exe file, and that exe file cannot be run from CMD.

The reason I want to use MS2017 instead of nvcc is that I have basically no experience with nvcc. When I copy the command from MS2017 to the nvcc command line to build the exe file, nothing is generated. So I personally prefer MS2017. Does anyone have experience with this? I would highly appreciate any help.

There is a CUDA sample code that demonstrates creation and usage of a DLL with CUDA functionality in it:

https://docs.nvidia.com/cuda/cuda-samples/index.html#advanced-keyconcepts

It happens to use the driver API, not the runtime API, but that doesn’t appear to be the crux of your question, and it should be straightforward to convert it to use the runtime API if desired.

Here is an extremely simple example, using three source files: cuda_dll.cu contains the DLL source code, cuda_dll.h exports a simple C-style API, and cuda_main.cu calls the DLL. This example uses the CUDA runtime.

cuda_dll.h:

#ifdef CUDADLL_EXPORTS
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT __declspec(dllimport)
#endif

extern "C" DLLEXPORT void wrapper (int n);
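Since the DLL is ultimately meant for a non-C++ host, it may help to make the header readable as plain C as well: `extern "C"` is not valid C, so a C compiler (or any tool that parses the header as C) would reject the declaration above. Here is a minimal sketch of such a variant. The `__cplusplus` guard is the usual idiom; the `_WIN32` fallback that defines DLLEXPORT to nothing is my own addition and only matters if the header is ever read by a non-Windows compiler.

```cpp
/* Variant of cuda_dll.h that can be included from both C and C++ code. */
#ifdef _WIN32
  #ifdef CUDADLL_EXPORTS
    #define DLLEXPORT __declspec(dllexport)   /* building the DLL */
  #else
    #define DLLEXPORT __declspec(dllimport)   /* using the DLL */
  #endif
#else
  #define DLLEXPORT                           /* assumption: no decoration needed elsewhere */
#endif

#ifdef __cplusplus
extern "C" {                                  /* C linkage, so the export is not name-mangled */
#endif

DLLEXPORT void wrapper (int n);

#ifdef __cplusplus
}
#endif
```

The exported function and its name are unchanged, so cuda_dll.cu and cuda_main.cu can include this variant without modification.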

cuda_dll.cu:

#include <stdio.h>
#include <stdlib.h>
#include "cuda_dll.h"

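// Trivial kernel: each launched thread prints its argument via device-side printf.
__global__ void kernel (int n)
{
    printf ("kernel: n = %d\n", n);
}

// Host-side entry point exported from the DLL (declared in cuda_dll.h).
void wrapper (int n)
{
    printf ("wrapper: calling kernel()\n");
    kernel<<<1,1>>>(n);        // launch one block containing one thread
    cudaDeviceSynchronize();   // wait for the kernel; this also flushes its printf output
}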

cuda_main.cu:

#include <stdio.h>
#include <stdlib.h>
#include "cuda_dll.h"

int main (void)
{
    printf ("main: calling wrapper()\n");
    wrapper (5);
    fflush (stdout);
    return EXIT_SUCCESS;
}

Build the DLL, which creates two files (as is the custom on Windows): the actual shared library cuda_dll.dll and the interface stub library cuda_dll.lib:

nvcc --shared -DCUDADLL_EXPORTS -o cuda_dll.dll cuda_dll.cu

Build the main app and link with the stub library:

nvcc -o cuda_main.exe cuda_main.cu cuda_dll.lib

Run the app (cuda_main.exe), which prints:

main: calling wrapper()
wrapper: calling kernel()
kernel: n = 5

Don’t ask me how to do this with the MSVS IDE. I have never used the IDE, so I don’t know.

You do not have to build the calling app from CUDA source code compiled with nvcc. You can also call the DLL from regular C++ code compiled with MSVC. Simply copy cuda_main.cu to cpp_main.cpp, then build the main app like so:

cl /Fecpp_main.exe cpp_main.cpp cuda_dll.lib
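Hosts that are not linked against cuda_dll.lib typically load the DLL at run time and look up the exported function by name. The sketch below does this from plain C++ with the Win32 calls LoadLibraryA, GetProcAddress, and FreeLibrary. It relies on the extern "C" declaration in cuda_dll.h, so that the export is named plain wrapper. The helper name call_wrapper_dynamically and the wrapper_fn typedef are hypothetical, and the non-Windows branch is there only so the file still compiles elsewhere.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Signature of the exported function, matching the declaration in cuda_dll.h.
typedef void (*wrapper_fn)(int);

// Hypothetical helper: loads the DLL at dll_path, calls wrapper(n), and unloads it.
// Returns true on success, false if the DLL or the symbol cannot be found.
bool call_wrapper_dynamically (const char* dll_path, int n)
{
    assert (dll_path != NULL);
#ifdef _WIN32
    HMODULE lib = LoadLibraryA (dll_path);
    if (lib == NULL) {
        std::fprintf (stderr, "could not load %s\n", dll_path);
        return false;
    }
    // Look up the export by its undecorated C name.
    wrapper_fn fn = reinterpret_cast<wrapper_fn>(GetProcAddress (lib, "wrapper"));
    if (fn == NULL) {
        std::fprintf (stderr, "wrapper() not found in %s\n", dll_path);
        FreeLibrary (lib);
        return false;
    }
    fn (n);
    FreeLibrary (lib);
    return true;
#else
    (void)n;  // unused on this platform
    std::fprintf (stderr, "cannot load %s: DLLs are Windows-only\n", dll_path);
    return false;
#endif
}
```

Calling call_wrapper_dynamically ("cuda_dll.dll", 5) from main should then behave like the statically linked version, provided cuda_dll.dll is in the DLL search path (for example, next to the exe). No stub library is needed on the cl command line in this case.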