Generating Shared Library on Windows

TL;DR: How do you create a shared library with NVCC when Windows 10 is the target platform?

Right now I am at the stage where I know enough to be dangerous but not enough to know what I’m doing. I am attempting to create a shared library for use with cgo (https://golang.org/cmd/cgo/), which uses gcc. I settled on a shared library after finding this StackOverflow question: https://stackoverflow.com/questions/32589153/how-to-compile-cuda-source-with-go-languages-cgo. I have been poring over this topic all week, and the following is what I think I should be doing:

I have 4 files: cGo.cu, cGo.cuh, cGo.h & cuda.go (all of which are listed at the bottom of this post).

cuda.go is my Go source file, which makes all the CUDA runtime API calls and also calls my kernel wrapper. cGo.h is a C header file included by cuda.go so that cgo has a declaration of my kernel wrapper. However, cgo can’t compile CUDA code, so I first have to build my code into a library.

So I create a shared library from cGo.cu & cGo.cuh. From what I can gather, I should do it like this:

nvcc -shared -o C:/Path/To/graph.so cuda.cu

This generates 3 files: graph.lib, graph.exp and graph.so. I then point the linker at its location using cgo’s linker flags (which get passed to gcc), shown in cuda.go below. However, when cgo (again via gcc) attempts to link my shared library, it gives me this error:

C:\Users\user\AppData\Local\Temp\go-build022337846\cuda_graph_wrapper\graphCuda\_obj\cuda.cgo2.o: In function `_cgo_1d856146b359_Cfunc_kernel_kValid':
/tmp/go-build\cuda_graph_wrapper\graphCuda\_obj/cgo-gcc-prolog:306: undefined reference to `kernel_kValid'

So that doesn’t work, but I found this post, https://devtalk.nvidia.com/default/topic/395049/cuda-programming-and-performance/shared-library-creation-/post/2796786/#2796786, which has me generate the shared library like this:

C:\Users\user\Documents\Repos\cuda\Graph Algorithm>nvcc --shared -o lib\graph.so cGo.cu -IC:/Storage/Cuda/include -LC:/Storage/Cuda/lib -lcudart

This produces just graph.so, but since there is now no graph.lib file, the linker can’t find the library when I pass it in.

So I’m kinda stumped. Obviously the issue is my ignorance, but I don’t know what to learn in order to fix my conundrum. I am starting to think that I have been following the procedure for a Linux target, but I need Windows. So at this point any help would be greatly appreciated.

cGo.cu

#include "cGo.cuh"
#include "device_launch_parameters.h"
#include "cuda.h"
#include "cuda_runtime.h"

/*
kernel_kValid is a wrapper function for the CUDA Kernel to be called from cgo
*/

extern "C" void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
	kValid<<<blocks, threads>>>(kInfo, values);//execute the kernel
}
/*
kValid is the CUDA Kernel which is to be executed
*/
__global__ void kValid(ktype *kInfo, glob *values) {
	//code
}

cGo.cuh

//For use with NVCC

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Declarations
*/
extern "C" void kernel_kValid(int , int , ktype *, glob *);
/*
Kernel Declarations
*/
__global__ void kValid(ktype *, glob *);

cGo.h

//for use with cgo

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Declarations
*/

void kernel_kValid(int , int , ktype *, glob *);
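
One thing that may be worth ruling out here (a guess, not a confirmed diagnosis): on Windows nvcc drives the MSVC toolchain, and MSVC only exports a function from a DLL when it is marked with __declspec(dllexport) or listed in a .def file. The linker writes the .lib/.exp pair only when at least one symbol is exported. Since cGo.cuh and cGo.h repeat the same typedefs and declaration, a single header shared by nvcc and cgo could carry that marking. Below is a minimal sketch under those assumptions. GRAPH_API is a made-up macro name, and the empty function body is only a stand-in so the snippet compiles by itself; the real definition would stay in cGo.cu and launch kValid.

```c
/* Sketch of one header usable from both cGo.cu (nvcc + MSVC) and cgo (gcc). */

/* GRAPH_API is a hypothetical name. cl.exe, the host compiler nvcc uses on
   Windows, defines _MSC_VER, so only the DLL build gets dllexport; gcc on
   the cgo side sees an empty macro. */
#if defined(_WIN32) && defined(_MSC_VER)
#  define GRAPH_API __declspec(dllexport)
#else
#  define GRAPH_API
#endif

typedef unsigned long int ktype;
typedef unsigned char glob;

#ifdef __cplusplus
extern "C" { /* nvcc compiles .cu files as C++; keep the symbol name unmangled */
#endif

/* Same wrapper declaration as in cGo.cuh and cGo.h, now marked for export. */
GRAPH_API void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values);

#ifdef __cplusplus
}
#endif

/* Stand-in definition (assumption: only here so this file links on its own).
   In cGo.cu the body would be the kValid<<<blocks, threads>>> launch. */
void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
	(void)blocks;
	(void)threads;
	(void)kInfo;
	(void)values;
}
```

If the export turns out to be the problem, the output name may matter as well: as far as I know, the MinGW linker resolves -lgraph against names such as graph.lib or graph.dll rather than graph.so, so building with -o lib\graph.dll might be the safer choice.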

cuda.go

/*
// C:/Storage/Cuda is a junction to my cuda installation
#cgo LDFLAGS: -LC:/Storage/Cuda/lib/x64 -lcudart
// path to my generated shared library graph.so
#cgo LDFLAGS: -L${SRCDIR}/lib -lgraph

// path to cuda headers
#cgo CFLAGS: -IC:/Storage/Cuda/include
// path to cGo.h
#cgo CFLAGS: -I${SRCDIR}/include

#include <cuda_runtime.h>
#include <stdlib.h>
#include "cGo.h"
*/
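import "C"
import "unsafe"
//code
func myFunc() {
	//...
	// cgo does not convert unsafe.Pointer to a typed C pointer implicitly, so the casts are explicit
	C.kernel_kValid(C.int(B), C.int(T), (*C.ktype)(unsafe.Pointer(storageDevice)), (*C.glob)(unsafe.Pointer(globDevice))) //a wrapper around my kernel
	cudaErr, err = C.cudaDeviceSynchronize() //a cuda runtime API call
	//...
}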

Did you ever get this to work?
I am using /unixpickle/cuda on Linux, but I would really like to get this running on Windows, too.
I started poking around on Windows and adding new environment variables, but I wasn’t really sure what goes where on Windows. Below are the variables that have to be set to get it to work on Linux.

export CUDA_PATH=/usr/local/cuda
export CPATH="$CUDA_PATH/include/"
export CGO_LDFLAGS="$CUDA_PATH/lib64/libcublas.so $CUDA_PATH/lib64/libcudart.so $CUDA_PATH/lib64/stubs/libcuda.so $CUDA_PATH/lib64/libcurand.so"
export LD_LIBRARY_PATH=$CUDA_PATH/lib64/