Make sure your PATH lists /usr/local/cuda-8.0/bin ahead of /usr/bin and that LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64; otherwise your system will keep using the older nvcc 7.5.
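A minimal sketch of that environment setup, assuming a bash-like shell and the default install prefix /usr/local/cuda-8.0 (the CUDA_HOME variable name is only a convenience used here):

```shell
# Assumed default install prefix; adjust if CUDA 8.0 lives elsewhere.
CUDA_HOME=/usr/local/cuda-8.0

# Prepend so this nvcc is found before any older one in /usr/bin.
export PATH="$CUDA_HOME/bin:$PATH"

# Append any existing LD_LIBRARY_PATH only if it is non-empty.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Print the first PATH entry to confirm the ordering.
echo "${PATH%%:*}"
```

If the toolkit is installed at that prefix, `which nvcc` and `nvcc --version` should then point at the 8.0 toolchain.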
Then build for sm_50 with PTX embedded in the binary and rely on NVIDIA's built-in JIT compiler (part of the CUDA driver) to translate the code for your architecture.
The following nvcc options build machine code for sm_50 and sm_52 (all versions of Maxwell cards) and also include the PTX for the compute 5.0 architecture, so the JIT compiler can do its work for Pascal, Volta and Turing (and future cards).
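The options themselves do not appear in this post. Judging from the CMake flags reported as working later in this thread, they were presumably along these lines. This is only a sketch: kernel.cu, the output name app, and the GENCODE_FLAGS variable are placeholders, not names from the original post.

```shell
# Flags taken from the working CMake setup later in this thread.
# The first two emit machine code for sm_50 and sm_52;
# the third embeds compute_50 PTX for the driver's JIT compiler.
GENCODE_FLAGS="-gencode=arch=compute_50,code=sm_50 \
-gencode=arch=compute_52,code=sm_52 \
-gencode=arch=compute_50,code=compute_50"

# Print the command line; drop the leading "echo" to actually compile.
# kernel.cu and app are placeholder names.
echo nvcc -O3 $GENCODE_FLAGS -o app kernel.cu
```

The last -gencode entry is the important one for newer cards: without a code=compute_XX target there is no PTX in the binary for the JIT compiler to work from.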
Hi cbuchner1.
Sorry, I can't understand what you mean.
I wrote
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -D_FORCE_INLINES -O3 -lineinfo
-gencode=arch=compute_50,code=sm_50
-gencode=arch=compute_52,code=sm_52)
in the CMakeLists.txt of my project, but I still get the error "invalid device function 8".
Did I do something wrong? Could you tell me how to do this? Thank you.
I changed it as follows:
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -D_FORCE_INLINES -O3 -lineinfo
-gencode=arch=compute_50,code=sm_50
-gencode=arch=compute_52,code=sm_52
-gencode=arch=compute_50,code=compute_50)
My project works fine now.
Thanks!
My last CUDA software release was built with CUDA 7.5 (static linking) using only -arch=sm_20, nothing else. I found this to give the best portability (meaning it can run on most existing GPUs out of the box) and speed (CUDA 8/9 did something that cut my code's speed by 25%).
I don't see any benefit in specifying the individual CC versions compared to setting only the lowest arch flag (sm_20 for CUDA 7-8, sm_30 for CUDA 9). Are there any benefits that I am not aware of?