Ubuntu 17.04, CUDA 8.0: Linker problems with CUDA examples (dynamic parallelism)

Hi!

After manually installing CUDA (see below), I have trouble linking programs that use dynamic parallelism. From the CUDA samples, I can build everything except the following:

  • 0_Simple/cdpSimpleQuicksort
  • 0_Simple/cdpSimplePrint
  • 3_Imaging/cudaDecodeGL
  • 6_Advanced/cdpAdvancedQuicksort
  • 6_Advanced/cdpBezierTessellation
  • 6_Advanced/cdpLUDecomposition
  • 6_Advanced/cdpQuadtree
  • 7_CUDALibraries/simpleDevLibCUBLAS
  • 7_CUDALibraries/simpleCUFFT_callback

The errors are as follows:

user@desktop:~/NVIDIA_CUDA-8.0_Samples$ make -k |grep error
nvlink error   : Undefined reference to 'cudaStreamCreateWithFlags' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'cdpSimpleQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaStreamDestroy' in 'cdpSimpleQuicksort.o' (target: sm_35)
make[1]: *** [cdpSimpleQuicksort] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [0_Simple/cdpSimpleQuicksort/Makefile.ph_build] Error 2
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'cdpSimplePrint.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'cdpSimplePrint.o' (target: sm_35)
make[1]: *** [cdpSimplePrint] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [0_Simple/cdpSimplePrint/Makefile.ph_build] Error 2
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
make[1]: *** [cudaDecodeGL] Error 1
make[1]: Target 'all' not remade because of errors.
make: *** [3_Imaging/cudaDecodeGL/Makefile.ph_build] Error 2
nvlink error   : Undefined reference to 'cudaStreamCreateWithFlags' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaMemcpyAsync' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaPeekAtLastError' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetLastError' in 'cdpAdvancedQuicksort.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetErrorString' in 'cdpAdvancedQuicksort.o' (target: sm_35)
make[1]: *** [cdpAdvancedQuicksort] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [6_Advanced/cdpAdvancedQuicksort/Makefile.ph_build] Error 2
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'BezierLineCDP.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'BezierLineCDP.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaFree' in 'BezierLineCDP.o' (target: sm_35)
make[1]: *** [cdpBezierTessellation] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [6_Advanced/cdpBezierTessellation/Makefile.ph_build] Error 2
nvlink error   : Undefined reference to 'cublasIdamax_v2' in 'dgetf2.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDswap_v2' in 'dgetf2.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDscal_v2' in 'dgetf2.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDger_v2' in 'dgetf2.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDtrsm_v2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDgemm_v2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasCreate_v2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasSetPointerMode_v2' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaStreamCreateWithFlags' in 'dgetrf.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasSetStream_v2' in 'dgetrf.o' (target: sm_35)
make[1]: *** [cdpLUDecomposition] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [6_Advanced/cdpLUDecomposition/Makefile.ph_build] Error 2
nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in 'cdpQuadtree.o' (target: sm_35)
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in 'cdpQuadtree.o' (target: sm_35)
make[1]: *** [cdpQuadtree] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [6_Advanced/cdpQuadtree/Makefile.ph_build] Error 2
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvlink error   : Undefined reference to 'cublasCreate_v2' in 'kernels.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasSgemm_v2' in 'kernels.o' (target: sm_35)
nvlink error   : Undefined reference to 'cublasDestroy_v2' in 'kernels.o' (target: sm_35)
make[1]: *** [simpleDevLibCUBLAS] Error 255
make[1]: Target 'all' not remade because of errors.
make: *** [7_CUDALibraries/simpleDevLibCUBLAS/Makefile.ph_build] Error 2
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvlink warning : Function '_Z27ComplexPointwiseMulAndScalePvmS_S_' has address taken but no possible call to it (target: sm_20)
nvlink warning : Function '_Z27ComplexPointwiseMulAndScalePvmS_S_' has address taken but no possible call to it (target: sm_30)
nvlink warning : Function '_Z27ComplexPointwiseMulAndScalePvmS_S_' has address taken but no possible call to it (target: sm_35)
nvlink warning : Function '_Z27ComplexPointwiseMulAndScalePvmS_S_' has address taken but no possible call to it (target: sm_50)
nvlink warning : Function '_Z27ComplexPointwiseMulAndScalePvmS_S_' has address taken but no possible call to it (target: sm_60)
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libcufft_static.a(fft_dimension_class_multi.o): In function `__sti____cudaRegisterAll_72_tmpxft_000015c4_00000000_15_fft_dimension_class_multi_compute_60_cpp1_ii_466e44ab()':
tmpxft_000015c4_00000000-4_fft_dimension_class_multi.compute_60.cudafe1.cpp:(.text+0xd2d): undefined reference to `__cudaRegisterLinkedBinary_72_tmpxft_000015c4_00000000_15_fft_dimension_class_multi_compute_60_cpp1_ii_466e44ab'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libcufft_static.a(fft_dimension_class_multi.o): In function `global constructors keyed to BaseListMulti::radices':
tmpxft_000015c4_00000000-4_fft_dimension_class_multi.compute_60.cudafe1.cpp:(.text+0x1aed): undefined reference to `__cudaRegisterLinkedBinary_72_tmpxft_000015c4_00000000_15_fft_dimension_class_multi_compute_60_cpp1_ii_466e44ab'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libcufft_static.a(database_cb.o): In function `__sti____cudaRegisterAll_58_tmpxft_0000437a_00000000_15_database_cb_compute_60_cpp1_ii_e424e52f()':
tmpxft_0000437a_00000000-4_database_cb.compute_60.cudafe1.cpp:(.text+0x8cd): undefined reference to `__cudaRegisterLinkedBinary_58_tmpxft_0000437a_00000000_15_database_cb_compute_60_cpp1_ii_e424e52f'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libcufft_static.a(database_cb.o): In function `global constructors keyed to make::fft1d(int, int, int, fftPrecision_t, int, int, kernel_db::description::layout_t)':

... SNIPP ...

collect2: error: ld returned 1 exit status
make[1]: *** [simpleCUFFT_callback] Error 1
make[1]: Target 'all' not remade because of errors.
make: *** [7_CUDALibraries/simpleCUFFT_callback/Makefile.ph_build] Error 2
make: Target 'all' not remade because of errors.
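Since the log is long, a per-object tally of the nvlink failures can make the pattern easier to see. The following is only a minimal sketch: the function name summarize_nvlink_errors is made up for illustration, and it assumes the error lines have exactly the format shown above.

```shell
# Hypothetical helper: reads build output on stdin and prints, for each object
# file, how many nvlink "Undefined reference" errors mention it.
summarize_nvlink_errors() {
  grep "nvlink error" \
    | sed -n "s/.*Undefined reference to '\([^']*\)' in '\([^']*\)'.*/\2/p" \
    | awk '{ count[$1]++ } END { for (obj in count) print count[obj], obj }' \
    | sort -k2
}

# Usage, assuming the diagnostics arrive on stderr (so merge it into stdout):
#   make -k 2>&1 | summarize_nvlink_errors
```

Every object in the tally is compiled for sm_35 with relocatable device code, which is what the cdp* samples have in common.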

I am running Ubuntu 17.04 (64-bit) with NVIDIA driver 381.22 and have installed CUDA 8.0.44 and cuDNN 6.0.21. Because there is no official support for 17.04 yet, and the driver bundled with CUDA 8.0.44 is not suitable for my GTX 1080 Ti, I installed the toolkit via apt instead:

sudo apt-get install nvidia-cuda-toolkit

Then I “simulated” the usual folder structure using symlinks:

user@desktop:/usr/local/cuda$ ls -la
total 12
drwxr-xr-x  3 root root 4096 May 27 22:54 .
drwxr-xr-x 11 root root 4096 May 28 00:18 ..
lrwxrwxrwx  1 root root   32 May 13 18:41 bin -> /usr/lib/nvidia-cuda-toolkit/bin
drwxr-xr-x  3 root root 4096 May 26 15:40 extras
lrwxrwxrwx  1 root root   12 May 13 18:42 include -> /usr/include
lrwxrwxrwx  1 root root   25 May 13 18:41 lib64 -> /usr/lib/x86_64-linux-gnu
lrwxrwxrwx  1 root root   38 May 13 18:41 libdevice -> /usr/lib/nvidia-cuda-toolkit/libdevice
lrwxrwxrwx  1 root root   38 May 26 15:33 nvvm -> /usr/lib/nvidia-cuda-toolkit/libdevice
user@desktop:/usr/local/cuda$ ls -la extras/CUPTI/
total 8
drwxr-xr-x 2 root root 4096 May 26 15:50 .
drwxr-xr-x 3 root root 4096 May 26 15:40 ..
lrwxrwxrwx 1 root root   12 May 26 15:40 include -> /usr/include
lrwxrwxrwx 1 root root   26 May 26 15:50 lib64 -> /usr/lib/x86_64-linux-gnu/
user@desktop:/usr/local/cuda$
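For reference, the layout in the listing above could be recreated with something like the following sketch. The function name make_cuda_layout and its root parameter are hypothetical. The link targets are copied from the listing as-is, including nvvm pointing at the libdevice directory.

```shell
# Hypothetical helper: recreate the symlink layout from the listing under a
# given root. Taking the root as a parameter allows a dry run in a scratch
# directory; writing to /usr/local/cuda itself needs root privileges.
make_cuda_layout() {
  root="$1"
  mkdir -p "$root/extras/CUPTI"
  ln -sfn /usr/lib/nvidia-cuda-toolkit/bin       "$root/bin"
  ln -sfn /usr/include                           "$root/include"
  ln -sfn /usr/lib/x86_64-linux-gnu              "$root/lib64"
  ln -sfn /usr/lib/nvidia-cuda-toolkit/libdevice "$root/libdevice"
  ln -sfn /usr/lib/nvidia-cuda-toolkit/libdevice "$root/nvvm"
  ln -sfn /usr/include                           "$root/extras/CUPTI/include"
  ln -sfn /usr/lib/x86_64-linux-gnu/             "$root/extras/CUPTI/lib64"
}

# Example dry run in a temporary directory:
#   make_cuda_layout "$(mktemp -d)/cuda"
```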

The header and libraries for cuDNN were copied to /usr/include and /usr/lib/x86_64-linux-gnu, respectively.

In addition, I added this to the .bashrc of my (non-root) user:

export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
PATH=${CUDA_HOME}/bin:${PATH}
export PATH

The listings of the relevant files:

user@desktop:~/NVIDIA_CUDA-8.0_Samples$ which nvcc
/usr/local/cuda/bin/nvcc
user@desktop:~/NVIDIA_CUDA-8.0_Samples$ which nvlink
/usr/bin/nvlink
user@desktop:~/NVIDIA_CUDA-8.0_Samples$ which gcc
/usr/local/cuda/bin/gcc
user@desktop:~/NVIDIA_CUDA-8.0_Samples$ which g++
/usr/local/cuda/bin/g++
user@desktop:/usr/lib/nvidia-cuda-toolkit$ ls -la
total 16
drwxr-xr-x   4 root root 4096 May 13 16:25 .
drwxr-xr-x 156 root root 4096 May 26 11:16 ..
drwxr-xr-x   3 root root 4096 May 26 15:56 bin
drwxr-xr-x   2 root root 4096 May 13 16:25 libdevice
user@desktop:/usr/lib/nvidia-cuda-toolkit$ ls -la bin
total 20908
drwxr-xr-x 3 root root     4096 May 26 15:56 .
drwxr-xr-x 4 root root     4096 May 13 16:25 ..
-rwxr-xr-x 1 root root  9913472 Sep 14  2016 cicc
drwxr-xr-x 2 root root     4096 May 13 16:25 crt
-rwxr-xr-x 1 root root      711 Jan 20 20:25 g++
-rwxr-xr-x 1 root root      711 Jan 20 19:48 gcc
lrwxrwxrwx 1 root root       40 Jan 20 20:25 libcuinj64.so -> ../../x86_64-linux-gnu/libcuinj64.so.8.0
-rwxr-xr-x 1 root root   245680 Sep 14  2016 nvcc
lrwxrwxrwx 1 root root       17 Jan 20 20:25 nvcc.profile -> /etc/nvcc.profile
-rwxr-xr-x 1 root root 11226584 Sep 14  2016 nvprof
user@desktop:/usr/lib/nvidia-cuda-toolkit$ ls -la libdevice/
total 1652
drwxr-xr-x 2 root root   4096 May 13 16:25 .
drwxr-xr-x 4 root root   4096 May 13 16:25 ..
-rw-r--r-- 1 root root 415644 Sep 14  2016 libdevice.compute_20.10.bc
-rw-r--r-- 1 root root 418144 Sep 14  2016 libdevice.compute_30.10.bc
-rw-r--r-- 1 root root 419136 Sep 14  2016 libdevice.compute_35.10.bc
-rw-r--r-- 1 root root 418524 Sep 14  2016 libdevice.compute_50.10.bc
user@desktop:/usr/include$ ls -la | grep cu
-rw-r--r--  1 root root   196582 Sep 14  2016 cublas_api.h
-rw-r--r--  1 root root    33423 Sep 14  2016 cublas.h
-rw-r--r--  1 root root    10263 Sep 14  2016 cublas_v2.h
-rw-r--r--  1 root root    41048 Jan 20 20:25 cublasXt.h
-rw-r--r--  1 root root    11896 Sep 14  2016 cuComplex.h
-rw-r--r--  1 root root    13858 Sep 14  2016 cuda_device_runtime_api.h
-rw-r--r--  1 root root   114479 Sep 14  2016 cuda_fp16.h
drwxr-xr-x  2 root root     4096 May 13 16:24 cuda-gdb
-rw-r--r--  1 root root    23248 Sep 14  2016 cudaGL.h
-rw-r--r--  1 root root    18437 Sep 14  2016 cuda_gl_interop.h
-rw-r--r--  1 root root   465287 Jan 20 20:25 cuda.h
-rw-r--r--  1 root root     4105 Sep 14  2016 cudalibxt.h
-rw-r--r--  1 root root    49195 Sep 14  2016 cuda_occupancy.h
-rw-r--r--  1 root root     6040 Sep 14  2016 cuda_profiler_api.h
-rw-r--r--  1 root root     6076 Sep 14  2016 cudaProfiler.h
-rw-r--r--  1 root root   312920 Jan 20 20:25 cuda_runtime_api.h
-rw-r--r--  1 root root    84176 Jan 20 20:25 cuda_runtime.h
-rw-r--r--  1 root root     4683 Sep 14  2016 cuda_stdint.h
-rw-r--r--  1 root root     4352 Sep 14  2016 cuda_surface_types.h
-rw-r--r--  1 root root     4857 Sep 14  2016 cuda_texture_types.h
-rw-r--r--  1 root root    13280 Sep 14  2016 cudaVDPAU.h
-rw-r--r--  1 root root     7543 Sep 14  2016 cuda_vdpau_interop.h
-r--r--r--  1 root root    98782 May 26 20:25 cudnn.h
-rw-r--r--  1 root root    12925 Sep 14  2016 cufft.h
-rw-r--r--  1 root root    18254 Sep 14  2016 cufftw.h
-rw-r--r--  1 root root    10812 Sep 14  2016 cufftXt.h
-rw-r--r--  1 root root   163962 Sep 14  2016 cupti_activity.h
-rw-r--r--  1 root root    23057 Sep 14  2016 cupti_callbacks.h
-rw-r--r--  1 root root    46042 Sep 14  2016 cupti_driver_cbid.h
-rw-r--r--  1 root root    49753 Sep 14  2016 cupti_events.h
-rw-r--r--  1 root root     4697 Sep 14  2016 cupti.h
-rw-r--r--  1 root root    31377 Sep 14  2016 cupti_metrics.h
-rw-r--r--  1 root root     5732 Sep 14  2016 cupti_nvtx_cbid.h
-rw-r--r--  1 root root     8127 Sep 14  2016 cupti_result.h
-rw-r--r--  1 root root    26934 Sep 14  2016 cupti_runtime_cbid.h
-rw-r--r--  1 root root     3869 Sep 14  2016 cupti_version.h
-rw-r--r--  1 root root    10836 Sep 14  2016 curand_discrete2.h
-rw-r--r--  1 root root     3486 Sep 14  2016 curand_discrete.h
-rw-r--r--  1 root root     3717 Sep 14  2016 curand_globals.h
-rw-r--r--  1 root root    42662 Sep 14  2016 curand.h
-rw-r--r--  1 root root    48013 Sep 14  2016 curand_kernel.h
-rw-r--r--  1 root root    28122 Sep 14  2016 curand_lognormal.h
-rw-r--r--  1 root root   168895 Sep 14  2016 curand_mrg32k3a.h
-rw-r--r--  1 root root   276660 Sep 14  2016 curand_mtgp32dc_p_11213.h
-rw-r--r--  1 root root     7921 Sep 14  2016 curand_mtgp32.h
-rw-r--r--  1 root root    17971 Sep 14  2016 curand_mtgp32_host.h
-rw-r--r--  1 root root    13819 Sep 14  2016 curand_mtgp32_kernel.h
-rw-r--r--  1 root root    26911 Sep 14  2016 curand_normal.h
-rw-r--r--  1 root root     4649 Sep 14  2016 curand_normal_static.h
-rw-r--r--  1 root root     7162 Sep 14  2016 curand_philox4x32_x.h
-rw-r--r--  1 root root    29829 Sep 14  2016 curand_poisson.h
-rw-r--r--  1 root root   333853 Sep 14  2016 curand_precalc.h
-rw-r--r--  1 root root    17346 Sep 14  2016 curand_uniform.h
-rw-r--r--  1 root root     3672 Sep 14  2016 cusolver_common.h
-rw-r--r--  1 root root    32907 Sep 14  2016 cusolverDn.h
-rw-r--r--  1 root root    17455 Sep 14  2016 cusolverRf.h
-rw-r--r--  1 root root    22184 Sep 14  2016 cusolverSp.h
-rw-r--r--  1 root root    27576 Sep 14  2016 cusolverSp_LOWLEVEL_PREVIEW.h
-rw-r--r--  1 root root   350772 Sep 14  2016 cusparse.h
-rw-r--r--  1 root root     2589 Sep 14  2016 cusparse_v2.h
-rw-r--r--  1 root root    25515 Sep 14  2016 cuviddec.h
-rw-r--r--  1 root root     2416 Sep 14  2016 generated_cuda_gl_interop_meta.h
-rw-r--r--  1 root root     3115 Sep 14  2016 generated_cudaGL_meta.h
-rw-r--r--  1 root root    46062 Sep 14  2016 generated_cuda_meta.h
-rw-r--r--  1 root root    33271 Sep 14  2016 generated_cuda_runtime_api_meta.h
-rw-r--r--  1 root root     1369 Sep 14  2016 generated_cuda_vdpau_interop_meta.h
-rw-r--r--  1 root root     1453 Sep 14  2016 generated_cudaVDPAU_meta.h
-rw-r--r--  1 root root    10124 Sep 14  2016 nvcuvid.h
user@desktop:/usr/lib/x86_64-linux-gnu$ ls -la | grep libcu
-rw-r--r--   1 root root  53603800 Sep 14  2016 libcublas_device.a
lrwxrwxrwx   1 root root        16 Jan 20 20:25 libcublas.so -> libcublas.so.8.0
lrwxrwxrwx   1 root root        19 Jan 20 20:25 libcublas.so.8.0 -> libcublas.so.8.0.45
-rw-r--r--   1 root root  41556032 Sep 14  2016 libcublas.so.8.0.45
-rw-r--r--   1 root root  48013344 Sep 14  2016 libcublas_static.a
-rw-r--r--   1 root root    558720 Sep 14  2016 libcudadevrt.a
lrwxrwxrwx   1 root root        16 Jan 20 20:25 libcudart.so -> libcudart.so.8.0
lrwxrwxrwx   1 root root        19 Jan 20 20:25 libcudart.so.8.0 -> libcudart.so.8.0.44
-rw-r--r--   1 root root    415432 Sep 14  2016 libcudart.so.8.0.44
-rw-r--r--   1 root root    775162 Sep 14  2016 libcudart_static.a
lrwxrwxrwx   1 root root        12 May  9 23:00 libcuda.so -> libcuda.so.1
lrwxrwxrwx   1 root root        17 May  9 23:00 libcuda.so.1 -> libcuda.so.381.22
-rw-r--r--   1 root root   8679656 May  4 09:28 libcuda.so.381.22
-rwxr-xr-x   1 root root 154322864 May 26 20:27 libcudnn.so
-rwxr-xr-x   1 root root 154322864 May 26 20:27 libcudnn.so.6
-rwxr-xr-x   1 root root 154322864 May 26 20:27 libcudnn.so.6.0.21
-rw-r--r--   1 root root 143843808 May 26 20:27 libcudnn_static.a
lrwxrwxrwx   1 root root        15 Jan 20 20:25 libcufft.so -> libcufft.so.8.0
lrwxrwxrwx   1 root root        18 Jan 20 20:25 libcufft.so.8.0 -> libcufft.so.8.0.44
-rw-r--r--   1 root root 146761440 Sep 14  2016 libcufft.so.8.0.44
-rw-r--r--   1 root root 129659122 Sep 14  2016 libcufft_static.a
lrwxrwxrwx   1 root root        16 Jan 20 20:25 libcufftw.so -> libcufftw.so.8.0
lrwxrwxrwx   1 root root        19 Jan 20 20:25 libcufftw.so.8.0 -> libcufftw.so.8.0.44
-rw-r--r--   1 root root    476840 Sep 14  2016 libcufftw.so.8.0.44
-rw-r--r--   1 root root     42294 Sep 14  2016 libcufftw_static.a
lrwxrwxrwx   1 root root        17 Jan 20 20:25 libcuinj64.so -> libcuinj64.so.8.0
lrwxrwxrwx   1 root root        20 Jan 20 20:25 libcuinj64.so.8.0 -> libcuinj64.so.8.0.44
-rw-r--r--   1 root root   6401960 Sep 14  2016 libcuinj64.so.8.0.44
-rw-r--r--   1 root root   1649302 Sep 14  2016 libculibos.a
-rw-r--r--   1 root root     55816 Mar 23 18:55 libcupscgi.so.1
lrwxrwxrwx   1 root root        23 May 13 15:16 libcupsfilters.so.1 -> libcupsfilters.so.1.0.0
-rw-r--r--   1 root root    182384 Mar 21 19:36 libcupsfilters.so.1.0.0
-rw-r--r--   1 root root     34656 Mar 23 18:55 libcupsimage.so.2
-rw-r--r--   1 root root     26464 Mar 23 18:55 libcupsmime.so.1
-rw-r--r--   1 root root    116816 Mar 23 18:55 libcupsppdc.so.1
-rw-r--r--   1 root root    554960 Mar 23 18:55 libcups.so.2
lrwxrwxrwx   1 root root        15 Jan 20 20:25 libcupti.so -> libcupti.so.8.0
lrwxrwxrwx   1 root root        18 Jan 20 20:25 libcupti.so.8.0 -> libcupti.so.8.0.44
-rw-r--r--   1 root root   4804072 Sep 14  2016 libcupti.so.8.0.44
lrwxrwxrwx   1 root root        16 Jan 20 20:25 libcurand.so -> libcurand.so.8.0
lrwxrwxrwx   1 root root        19 Jan 20 20:25 libcurand.so.8.0 -> libcurand.so.8.0.44
-rw-r--r--   1 root root  59110720 Sep 14  2016 libcurand.so.8.0.44
-rw-r--r--   1 root root  59305156 Sep 14  2016 libcurand_static.a
lrwxrwxrwx   1 root root        19 Apr 17 22:20 libcurl-gnutls.so.3 -> libcurl-gnutls.so.4
lrwxrwxrwx   1 root root        23 Apr 17 22:20 libcurl-gnutls.so.4 -> libcurl-gnutls.so.4.4.0
-rw-r--r--   1 root root    465280 Apr 17 22:20 libcurl-gnutls.so.4.4.0
lrwxrwxrwx   1 root root        12 Apr 17 22:20 libcurl.so.3 -> libcurl.so.4
lrwxrwxrwx   1 root root        16 Apr 17 22:20 libcurl.so.4 -> libcurl.so.4.4.0
-rw-r--r--   1 root root    473472 Apr 17 22:20 libcurl.so.4.4.0
lrwxrwxrwx   1 root root        18 Jan 20 20:25 libcusolver.so -> libcusolver.so.8.0
lrwxrwxrwx   1 root root        21 Jan 20 20:25 libcusolver.so.8.0 -> libcusolver.so.8.0.44
-rw-r--r--   1 root root  53874104 Sep 14  2016 libcusolver.so.8.0.44
-rw-r--r--   1 root root  22385324 Sep 14  2016 libcusolver_static.a
lrwxrwxrwx   1 root root        18 Jan 20 20:25 libcusparse.so -> libcusparse.so.8.0
lrwxrwxrwx   1 root root        21 Jan 20 20:25 libcusparse.so.8.0 -> libcusparse.so.8.0.44
-rw-r--r--   1 root root  42980872 Sep 14  2016 libcusparse.so.8.0.44
-rw-r--r--   1 root root  51597424 Sep 14  2016 libcusparse_static.a
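
The which output at the top of these listings shows nvcc resolving inside /usr/local/cuda/bin while nvlink resolves to /usr/bin. A small loop makes it easy to recheck this after changing PATH or the symlinks. This is only a sketch, and show_tool_paths is a hypothetical name.

```shell
# Hypothetical helper: print where each named tool resolves on the current PATH.
show_tool_paths() {
  for tool in "$@"; do
    if resolved=$(command -v "$tool"); then
      printf '%s: %s\n' "$tool" "$resolved"
    else
      printf '%s: not found\n' "$tool"
    fi
  done
}

# Check the tools that take part in compiling and linking the samples.
show_tool_paths nvcc nvlink gcc g++
```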

I extracted the samples from the official CUDA 8.0 runfile to my home folder and tried to build them with make (or sudo make).

Despite the errors above, I was able to build TensorFlow from source with CUDA support. But I would like to understand why some parts are not working and fix them so that I have a proper configuration. Any ideas to help me fix this issue? Thanks!

Update: I was able to build 3_Imaging/cudaDecodeGL after adding a symlink at /usr/lib/x86_64-linux-gnu/libnvcuvid.so that points to /usr/lib/nvidia-381/libnvcuvid.so.

I also tried chmod ugo+x on all libraries, but I still get the linker errors regarding dynamic parallelism, cuFFT, and cuBLAS.

export CUDA_HOME=/usr/

Thanks, but setting CUDA_HOME=/usr/ instead of CUDA_HOME=/usr/local/cuda didn’t improve or change anything at all.