CUDA 10.1 - tensorflow build configuration fails

Hello,

I want to build TensorFlow 1.14 with CUDA 10.1.168 and cuDNN 7.6.1.34 on Manjaro, but ./configure fails after I enter the CUDA version. Here is what I ran:

sudo pacman -S cuda cudnn
./bazel_0.24.1_installer_linux_x86_64.sh --user
export PATH="$PATH:$HOME/bin"
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.14
./configure

and then I got the following:

WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: 

Found possible Python library paths:
  /usr/lib/python3.7/site-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3.7/site-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: 
XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: 
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Do you wish to build TensorFlow with TensorRT support? [y/N]: 
No TensorRT support will be enabled for TensorFlow.

Could not find any cuda.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
of:
        '/opt/cuda/extras/CUPTI/lib64'
        '/opt/cuda/lib64'
        '/opt/cuda/nvvm/lib64'
        '/usr'
        '/usr/lib'
        '/usr/lib/libfakeroot'
        '/usr/lib32'
Asking for detailed CUDA configuration...

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]: 10.1.168

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.6.1.34

Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]: 1.3

Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: 5.0

Could not find any cuda.h matching version '10.1.168' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
of:
Asking for detailed CUDA configuration...

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]:

So I cannot configure the build. I also tried typing 10.1 instead of 10.1.168, and even choosing the default option. Is there something I need to do beforehand?

Hi, I have solved it.

find -name cuda.h

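Once you have found it, enter the base directory it lives under (for example /usr/local/cuda if the header is /usr/local/cuda/include/cuda.h) at the prompt "Please specify the comma-separated list of base paths to look for CUDA libraries and headers". For example: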

/usr/lib/x86_64-linux-gnu/,/usr/local/cuda,

Do not forget to separate the paths with commas!
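
The lookup described above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names cuda_base_paths and cuda_version_from_header are made up for illustration, the search roots /opt/cuda and /usr/local/cuda are guesses that may not match your system, and it only handles headers located at <base>/include/cuda.h. The version helper reads the "#define CUDA_VERSION" line in cuda.h, which encodes the version as major*1000 + minor*10 (10010 for CUDA 10.1).

```shell
# Hypothetical helpers; names and search roots are assumptions for illustration.

# Print the comma-separated base paths of every <base>/include/cuda.h
# found under the given root directories.
cuda_base_paths() {
  find "$@" -path '*/include/cuda.h' 2>/dev/null \
    | sed 's|/include/cuda\.h$||' | sort -u | paste -sd, -
}

# Decode "#define CUDA_VERSION 10010" from a cuda.h file into "10.1".
# Exits non-zero if the define is missing.
cuda_version_from_header() {
  awk '$1 == "#define" && $2 == "CUDA_VERSION" {
         printf "%d.%d\n", int($3 / 1000), int(($3 % 1000) / 10)
         found = 1
         exit
       }
       END { exit !found }' "$1"
}

# Search a few common install roots (assumed; adjust for your system).
bases=$(cuda_base_paths /opt/cuda /usr/local/cuda)
echo "Base paths to enter at the prompt: ${bases:-none found}"

# Show which CUDA version each header reports.
old_ifs=$IFS
IFS=,
for base in $bases; do
  echo "$base: CUDA $(cuda_version_from_header "$base/include/cuda.h")"
done
IFS=$old_ifs
```

The printed list can be pasted at the base-paths prompt. The major.minor value it reports (10.1 rather than 10.1.168) is probably the safer thing to type at the CUDA version prompt, although that is an assumption about how configure compares versions.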

@machine.lyc @mialland.a
I also have an issue building TensorFlow on Windows:

Repository rule cuda_configure defined at:
  C:/tensorflow/third_party/gpus/cuda_configure.bzl:1399:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
   Traceback (most recent call last):
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 1369, column 38, in _cuda_autoconf_impl
                _create_local_cuda_repository(repository_ctx)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 955, column 35, in _create_local_cuda_repository
                cuda_config = _get_cuda_config(repository_ctx, find_cuda_config_script)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 657, column 30, in _get_cuda_config
                config = find_cuda_config(repository_ctx, find_cuda_config_script, ["cuda", "cudnn"])
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 635, column 41, in find_cuda_config
                exec_result = _exec_find_cuda_config(repository_ctx, script_path, cuda_libraries)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 629, column 19, in _exec_find_cuda_config
                return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
        File "C:/tensorflow/third_party/remote_config/common.bzl", line 208, column 13, in execute
                fail(
Error in fail: Repository command failed
Could not find any cudnn.h, cudnn_version.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'

I even added the following in third_party/gpus/find_cuda_config.py:

  cuda_path = "C:/Program\ Files/NVIDIA\ GPU\ Computing\ Toolkit/CUDA/v10.1"
  cudnn_path = "C:/tools/cuda"

That helped only at the python ./configure.py stage; bazel build //tensorflow/tools/pip_package:build_pip_package still fails.
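
One way to narrow this down is to repeat the lookup by hand: take the comma-separated base paths and check each of the subdirectories named in the error message for the header. Below is a minimal sketch, assuming a bash-like shell (for example an MSYS2 bash on Windows). The function name find_header_in_bases is hypothetical, the 'include/*-linux-gnu' pattern is left out, and it only approximates what the configure script does. Separately, note that spaces inside a Python string literal need no backslash escaping, so the "\ " sequences in the snippet above stay in the path as literal backslashes, which may not be what was intended.

```shell
# find_header_in_bases HEADER "base1,base2,..."
# Hypothetical helper: prints every location where HEADER exists below the
# given base paths, and returns non-zero if it was not found anywhere.
find_header_in_bases() {
  header=$1
  found=1
  old_ifs=$IFS
  IFS=,                      # split the second argument on commas only
  for base in $2; do
    IFS=$old_ifs
    [ -n "$base" ] || continue
    # Subdirectories taken from the error message (glob pattern omitted).
    for sub in "" include include/cuda extras/CUPTI/include include/cuda/CUPTI; do
      candidate="$base${sub:+/$sub}/$header"
      if [ -f "$candidate" ]; then
        echo "$candidate"
        found=0
      fi
    done
  done
  IFS=$old_ifs
  return $found
}

# Example using the paths from this thread unless TF_CUDA_PATHS is already set.
TF_CUDA_PATHS=${TF_CUDA_PATHS:-"C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1"}
find_header_in_bases cudnn.h "$TF_CUDA_PATHS" \
  || echo "cudnn.h not found under: $TF_CUDA_PATHS"
```

If the header is reported as missing here as well, the base paths themselves are the likely problem rather than the Bazel step.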

@machine.lyc @mialland.a

I was able to get the TensorFlow build started, but it fails with the following error:

PS C:\tensorflow> bazel build --config=opt --config=cuda --define=no_tensorflow_py_deps=true //tensorflow/tools/pip_package:build_pip_package
WARNING: The following configs were expanded more than once: [cuda, using_cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=157
INFO: Reading rc options for 'build' from c:\tensorflow\.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Options provided by the client:
  'build' options: --python_path=C:/Python/CPython38/python.exe
INFO: Reading rc options for 'build' from c:\tensorflow\.bazelrc:
  'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --java_toolchain=//third_party/toolchains/java:tf_java_toolchain --host_java_toolchain=//third_party/toolchains/java:tf_java_toolchain --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --noincompatible_prohibit_aapt1 --enable_platform_specific_config --config=v2
INFO: Reading rc options for 'build' from c:\tensorflow\.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=C:/Python/CPython38/python.exe --action_env PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages --python_path=C:/Python/CPython38/python.exe --config=xla --action_env TF_CUDA_VERSION=10.1 --action_env TF_CUDNN_VERSION=7.6.5 --action_env TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1 --action_env CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1 --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.8 --config=cuda --define=override_eigen_strong_inline=true --action_env TF_CONFIGURE_IOS=0
INFO: Found applicable config definition build:v2 in file c:\tensorflow\.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:xla in file c:\tensorflow\.bazelrc: --action_env=TF_ENABLE_XLA=1 --define=with_xla_support=true
INFO: Found applicable config definition build:cuda in file c:\tensorflow\.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file c:\tensorflow\.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:opt in file c:\tensorflow\.tf_configure.bazelrc: --copt=/arch:AVX --define with_default_optimizations=true
INFO: Found applicable config definition build:cuda in file c:\tensorflow\.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file c:\tensorflow\.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:windows in file c:\tensorflow\.bazelrc: --copt=/w --copt=/D_USE_MATH_DEFINES --host_copt=/D_USE_MATH_DEFINES --cxxopt=/std:c++14 --host_cxxopt=/std:c++14 --config=monolithic --copt=-DWIN32_LEAN_AND_MEAN --host_copt=-DWIN32_LEAN_AND_MEAN --copt=-DNOGDI --host_copt=-DNOGDI --linkopt=/DEBUG --host_linkopt=/DEBUG --linkopt=/OPT:REF --host_linkopt=/OPT:REF --linkopt=/OPT:ICF --host_linkopt=/OPT:ICF --experimental_strict_action_env=true --verbose_failures --distinct_host_configuration=false
INFO: Found applicable config definition build:monolithic in file c:\tensorflow\.bazelrc: --define framework_shared_object=false
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/7e825abd5704ce28b166f9463d4bd304348fd2a9.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: C:/tensorflow/tensorflow/core/BUILD:1749:11: in linkstatic attribute of cc_library rule //tensorflow/core:lib_internal: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'cc_library', the error might have been caused by the macro implementation
WARNING: C:/tensorflow/tensorflow/core/BUILD:2161:16: in linkstatic attribute of cc_library rule //tensorflow/core:framework_internal: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'tf_cuda_library', the error might have been caused by the macro implementation
WARNING: Download from https://mirror.bazel.build/github.com/aws/aws-sdk-cpp/archive/1.7.336.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: C:/tensorflow/tensorflow/core/BUILD:1774:11: in linkstatic attribute of cc_library rule //tensorflow/core:lib_headers_for_pybind: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'cc_library', the error might have been caused by the macro implementation
WARNING: C:/tensorflow/tensorflow/python/BUILD:4662:11: in py_library rule //tensorflow/python:standard_ops: target '//tensorflow/python:standard_ops' depends on deprecated target '//tensorflow/python/ops/distributions:distributions': TensorFlow Distributions has migrated to TensorFlow Probability (https://github.com/tensorflow/probability). Deprecated copies remaining in tf.distributions will not receive new features, and will be removed by early 2019. You should update all usage of `tf.distributions` to `tfp.distributions`.
WARNING: C:/tensorflow/tensorflow/python/BUILD:115:11: in py_library rule //tensorflow/python:no_contrib: target '//tensorflow/python:no_contrib' depends on deprecated target '//tensorflow/python/ops/distributions:distributions': TensorFlow Distributions has migrated to TensorFlow Probability (https://github.com/tensorflow/probability). Deprecated copies remaining in tf.distributions will not receive new features, and will be removed by early 2019. You should update all usage of `tf.distributions` to `tfp.distributions`.
INFO: Analyzed target //tensorflow/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: C:/tensorflow/tensorflow/core/framework/BUILD:1107:31: Executing genrule //tensorflow/core/framework:attr_value_proto_text_srcs failed (Exit 126): bash.exe failed: error executing command
  cd C:/users/{user}/_bazel_{user}/xv6zejqw/execroot/org_tensorflow
  SET CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET PATH=C:\msys64\usr\bin;C:\msys64\bin;C:\WINDOWS;C:\WINDOWS\System32;C:\WINDOWS\System32\WindowsPowerShell\v1.0
    SET PYTHON_BIN_PATH=C:/Python/CPython38/python.exe
    SET PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages
    SET RUNFILES_MANIFEST_ONLY=1
    SET TF2_BEHAVIOR=1
    SET TF_CONFIGURE_IOS=0
    SET TF_CUDA_COMPUTE_CAPABILITIES=3.8
    SET TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET TF_CUDA_VERSION=10.1
    SET TF_CUDNN_VERSION=7.6.5
    SET TF_ENABLE_XLA=1
    SET TF_NEED_CUDA=1
  C:/msys64/usr/bin/bash.exe -c source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions bazel-out/x64_windows-opt/bin/tensorflow/core/framework tensorflow/core/framework/ tensorflow/core/framework/attr_value.proto tensorflow/core/framework/resource_handle.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/types.proto tensorflow/tools/proto_text/placeholder.txt
Execution platform: @local_execution_config_platform//:platform
/usr/bin/bash: bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions: Bad address
Target //tensorflow/tools/pip_package:build_pip_package failed to build
ERROR: C:/tensorflow/tensorflow/python/BUILD:1419:27 Executing genrule //tensorflow/core/framework:attr_value_proto_text_srcs failed (Exit 126): bash.exe failed: error executing command
  cd C:/users/{user}/_bazel_{user}/xv6zejqw/execroot/org_tensorflow
  SET CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET PATH=C:\msys64\usr\bin;C:\msys64\bin;C:\WINDOWS;C:\WINDOWS\System32;C:\WINDOWS\System32\WindowsPowerShell\v1.0
    SET PYTHON_BIN_PATH=C:/Python/CPython38/python.exe
    SET PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages
    SET RUNFILES_MANIFEST_ONLY=1
    SET TF2_BEHAVIOR=1
    SET TF_CONFIGURE_IOS=0
    SET TF_CUDA_COMPUTE_CAPABILITIES=3.8
    SET TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET TF_CUDA_VERSION=10.1
    SET TF_CUDNN_VERSION=7.6.5
    SET TF_ENABLE_XLA=1
    SET TF_NEED_CUDA=1
  C:/msys64/usr/bin/bash.exe -c source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions bazel-out/x64_windows-opt/bin/tensorflow/core/framework tensorflow/core/framework/ tensorflow/core/framework/attr_value.proto tensorflow/core/framework/resource_handle.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/types.proto tensorflow/tools/proto_text/placeholder.txt
Execution platform: @local_execution_config_platform//:platform
INFO: Elapsed time: 3.038s, Critical Path: 0.39s
INFO: 21 processes: 17 internal, 4 local.
FAILED: Build did NOT complete successfully