The following error occurs when I try to build the CUDA samples that ship with the installation (e.g. vectorAdd_nvrtc):
clang++ -I../../common/inc -I/Developer/NVIDIA/CUDA-9.2/include -arch x86_64 -o vectorAdd.o -c vectorAdd.cpp
clang++ -rpath /Developer/NVIDIA/CUDA-9.2/lib -o vectorAdd_nvrtc vectorAdd.o -L/Developer/NVIDIA/CUDA-9.2/lib -lnvrtc
Undefined symbols for architecture x86_64:
  "_cuCtxCreate_v2", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
  "_cuCtxSynchronize", referenced from:
      _main in vectorAdd.o
  "_cuDeviceGet", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
      findCudaDeviceDRV(int, char const**) in vectorAdd.o
      gpuDeviceInitDRV(int, char const**) in vectorAdd.o
  "_cuDeviceGetAttribute", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
      gpuGetMaxGflopsDeviceIdDRV() in vectorAdd.o
      void getCudaAttribute<int>(int*, CUdevice_attribute_enum, int) in vectorAdd.o
  "_cuDeviceGetCount", referenced from:
      gpuDeviceInitDRV(int, char const**) in vectorAdd.o
      gpuGetMaxGflopsDeviceIdDRV() in vectorAdd.o
  "_cuDeviceGetName", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
      findCudaDeviceDRV(int, char const**) in vectorAdd.o
      gpuDeviceInitDRV(int, char const**) in vectorAdd.o
  "_cuInit", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
      gpuDeviceInitDRV(int, char const**) in vectorAdd.o
      gpuGetMaxGflopsDeviceIdDRV() in vectorAdd.o
  "_cuLaunchKernel", referenced from:
      _main in vectorAdd.o
  "_cuMemAlloc_v2", referenced from:
      _main in vectorAdd.o
  "_cuMemFree_v2", referenced from:
      _main in vectorAdd.o
  "_cuMemcpyDtoH_v2", referenced from:
      _main in vectorAdd.o
  "_cuMemcpyHtoD_v2", referenced from:
      _main in vectorAdd.o
  "_cuModuleGetFunction", referenced from:
      _main in vectorAdd.o
  "_cuModuleLoadDataEx", referenced from:
      loadPTX(char*, int, char**) in vectorAdd.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [vectorAdd_nvrtc] Error 1
However, this only happens when the runtime compilation library NVRTC is used; the plain vectorAdd sample builds just fine. I tried several other sample pairs (e.g. clock vs. clock_nvrtc) and observed the same outcome.
I researched a little online and it seems to be related to a header include problem, but I am not very familiar with C++ and have no idea how to fix it. The Makefile does state that NVRTC is not supported on ARMv7 systems, but I do not think that is the issue, since my architecture is x86_64. Any help is greatly appreciated!
(CUDA version 9.2, running on macOS 10.13.6 with Xcode 9.2 and Apple LLVM version 9.0.0.)