I am trying to use cuFFT callbacks in my code, which requires linking against the static cuFFT library, but I can’t get my application to build. When I link with -lcufft instead, everything works fine. Unfortunately, I cannot share any code, but I will do my best to describe my setup and build process.
I’m working on 64-bit Linux, with CUDA 10.1 and a V100.
I have a file called algorithm_gpu.cu. It contains both device and host code, makes up the majority of my application, and includes the cuFFT callbacks. I also have a file called algorithm.cc that wraps the CUDA code. It contains the main function, parses command line arguments, sets up some data structures, and then passes them to the primary function in algorithm_gpu.cu.
To compile, I use these steps:
nvcc -gencode arch=compute_70,code=sm_70 -std=c++11 -I<path> -dc algorithm_gpu.cu -o algorithm_gpu_device.o
nvcc -gencode arch=compute_70,code=sm_70 -std=c++11 -I<path> -dlink algorithm_gpu_device.o -o algorithm_gpu_device_linked.o
g++ -I<path> <compiler-flags> -c algorithm.cc -o algorithm.o
g++ algorithm.o algorithm_gpu_device_linked.o algorithm_gpu_device.o -lcudart -lcufft_static -lculibos -o algorithm.exe
The last step produces hundreds, maybe thousands, of errors. Most of them say something like:
libcufft_static.a(dpVector2048D_cb.o): In function '__sti___cudaRegisterAll()':dpVector2048D_cb.compute_75.cudafe1.cpp(.text+0xfd): undefined reference to '__cudaRegisterLinkedBinary_35_dpVector2048D_cb_compute_75_cpp1_ii_baf75141'
I suspect I’m doing something wrong in my device compiling and linking, or I’m missing a library that I should be linking.
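One variant I have not verified yet: my understanding, which may be wrong, is that the callback-capable static library carries relocatable device code, so the device link step may also need to see -lcufft_static. Below is a dry-run sketch that only prints the commands for that variant. The print_build_steps helper and the ARCH and INC variables are hypothetical names for illustration, and <path> is left as a placeholder exactly as in my steps above.

```shell
# Dry-run sketch: prints the candidate build commands instead of running them.
# ARCH and INC are illustrative variables; <path> is an unfilled placeholder.
ARCH="-gencode arch=compute_70,code=sm_70"
INC="-I<path>"

print_build_steps() {
    # 1. Compile the CUDA source to relocatable device code.
    echo "nvcc $ARCH -std=c++11 $INC -dc algorithm_gpu.cu -o algorithm_gpu_device.o"
    # 2. Device link. ASSUMPTION: the static cuFFT library is passed here as well,
    #    so its device code takes part in the device link.
    echo "nvcc $ARCH -std=c++11 $INC -dlink algorithm_gpu_device.o -lcufft_static -o algorithm_gpu_device_linked.o"
    # 3. Compile the host-only wrapper (extra compiler flags omitted here).
    echo "g++ $INC -c algorithm.cc -o algorithm.o"
    # 4. Final host link, unchanged from my original last step.
    echo "g++ algorithm.o algorithm_gpu_device_linked.o algorithm_gpu_device.o -lcudart -lcufft_static -lculibos -o algorithm.exe"
}

print_build_steps
```

The only difference from my current process is the extra -lcufft_static in step 2; everything else is the same as what I described.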
Any help is greatly appreciated.