TL;DR: How do you create a shared library with NVCC when Windows 10 is the target platform?
Right now I am at the stage where I know enough to be dangerous but not enough to know what I’m doing. I am trying to create a shared library for use with cgo (https://golang.org/cmd/cgo/), which uses gcc. I settled on a shared library after finding this StackOverflow question: https://stackoverflow.com/questions/32589153/how-to-compile-cuda-source-with-go-languages-cgo. I have been poring over this topic all week, and the following is what I think I should be doing:
I have 4 files: cGo.cu, cGo.cuh, cGo.h and cuda.go (all of which are listed at the bottom of this post).
cuda.go is my Go source file, which makes all of the CUDA runtime API calls and also calls my kernel wrapper. cGo.h is a C header that cuda.go includes so that cgo has a declaration of my kernel wrapper. However, cgo cannot compile CUDA code, so I first have to build my code into a library.
So I create a shared library from cGo.cu and cGo.cuh. As far as I can tell, this is how it should be done:
nvcc -shared -o C:/Path/To/graph.so cuda.cu
This generates 3 files: graph.lib, graph.exp and graph.so. I then point the linker at their location using cgo’s linker flags (which get passed to gcc), as seen in cuda.go below. However, when cgo (again via gcc) attempts to link my shared library, it gives me this error:
C:\Users\user\AppData\Local\Temp\go-build022337846\cuda_graph_wrapper\graphCuda\_obj\cuda.cgo2.o: In function `_cgo_1d856146b359_Cfunc_kernel_kValid':
/tmp/go-build\cuda_graph_wrapper\graphCuda\_obj/cgo-gcc-prolog:306: undefined reference to `kernel_kValid'
So that doesn’t work. I then found this post, https://devtalk.nvidia.com/default/topic/395049/cuda-programming-and-performance/shared-library-creation-/post/2796786/#2796786, which has me generate the shared library like this:
C:\Users\user\Documents\Repos\cuda\Graph Algorithm>nvcc --shared -o lib\graph.so cGo.cu -IC:/Storage/Cuda/include -LC:/Storage/Cuda/lib -lcudart
This produces just graph.so, but since there is no longer a graph.lib file, the linker can’t find the library when I pass it the path.
So I’m kind of stumped. Obviously the issue is my ignorance, but I don’t know what I need to learn in order to fix this. I am starting to think that the approach I have been following is meant for a Linux target platform, but I need Windows. At this point any help would be greatly appreciated.
cGo.cu
#include "cGo.cuh"
#include "device_launch_parameters.h"
#include "cuda.h"
#include "cuda_runtime.h"
/*
kernel_kValid is a wrapper function for the CUDA Kernel to be called from cgo
*/
extern "C" void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    kValid<<<blocks, threads>>>(kInfo, values); // execute the kernel
}
/*
kValid is the CUDA Kernel which is to be executed
*/
__global__ void kValid(ktype *kInfo, glob *values) {
    //code
}
cGo.cuh
//For use with NVCC
typedef unsigned long int ktype;
typedef unsigned char glob;
/*
function Declarations
*/
extern "C" void kernel_kValid(int , int , ktype *, glob *);
/*
Kernel Declarations
*/
__global__ void kValid(ktype *, glob *);
cGo.h
//for use with cgo
typedef unsigned long int ktype;
typedef unsigned char glob;
/*
function Declarations
*/
void kernel_kValid(int , int , ktype *, glob *);
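For reference, cGo.cuh and cGo.h above repeat the same typedefs and the same wrapper declaration. Below is a minimal sketch of a single header that both nvcc and cgo could include. It also reflects one fact about the Windows toolchain: nvcc uses MSVC as its host compiler there, and a DLL built with MSVC exports only the functions marked `__declspec(dllexport)` (or listed in a .def file). Whether a missing export is what causes the `undefined reference` error above is an assumption, not something confirmed here. The macro names `GRAPH_API` and `GRAPH_BUILD_DLL` are hypothetical, and the function body is a host-only placeholder so the sketch compiles without nvcc.

```c
#include <assert.h>
#include <stddef.h>

/* GRAPH_API and GRAPH_BUILD_DLL are hypothetical names chosen for this sketch.
   When the library itself is built on Windows, the wrapper is marked for export;
   everywhere else the macro expands to nothing. */
#if defined(_WIN32) && defined(GRAPH_BUILD_DLL)
#define GRAPH_API __declspec(dllexport)
#else
#define GRAPH_API
#endif

/* nvcc compiles .cu files as C++, cgo compiles the header as C.
   The guard gives the wrapper C linkage in both cases. */
#ifdef __cplusplus
extern "C" {
#endif

typedef unsigned long int ktype;
typedef unsigned char glob;

GRAPH_API void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values);

#ifdef __cplusplus
}
#endif

/* Placeholder body (assumption): the real wrapper in cGo.cu launches
   kValid<<<blocks, threads>>>(kInfo, values). This stand-in only marks one
   slot per would-be thread so the sketch can be compiled and run on the host. */
void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    assert(kInfo != NULL && values != NULL);
    int n = blocks * threads;
    for (int i = 0; i < n; i++) {
        values[i] = 1;
    }
}
```

With a header like this, the library build would define `GRAPH_BUILD_DLL` (for example with `-DGRAPH_BUILD_DLL` on the nvcc command line), while cuda.go would include the same header without it.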
cuda.go
/*
// C:/Storage/Cuda is a junction to my CUDA installation
#cgo LDFLAGS: -LC:/Storage/Cuda/lib/x64 -lcudart
// path to my generated shared library graph.so
#cgo LDFLAGS: -L${SRCDIR}/lib -lgraph
// path to the CUDA headers
#cgo CFLAGS: -IC:/Storage/Cuda/include
// path to cGo.h
#cgo CFLAGS: -I${SRCDIR}/include
#include <cuda_runtime.h>
#include <stdlib.h>
#include "cGo.h"
*/
import "C"
import "unsafe"
//code
func myFunc() {
	//...
	// a wrapper around my kernel; the pointers are converted to the C parameter types
	C.kernel_kValid(C.int(B), C.int(T), (*C.ktype)(unsafe.Pointer(storageDevice)), (*C.glob)(unsafe.Pointer(globDevice)))
	cudaErr, err = C.cudaDeviceSynchronize() // a CUDA runtime API call
	//...
}