CUDA 10.1 - tensorflow build configuration fails

Hello,

I want to build TensorFlow 1.14 with CUDA 10.1.168 and cuDNN 7.6.1.34 on Manjaro, but ./configure fails as soon as I enter the CUDA version. These are the steps I ran:

sudo pacman -S cuda cudnn
./bazel_0.24.1_installer_linux_x86_64.sh --user
export PATH="$PATH:$HOME/bin"
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.14
./configure

and then I got the following:

WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: 

Found possible Python library paths:
  /usr/lib/python3.7/site-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3.7/site-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: 
XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: 
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Do you wish to build TensorFlow with TensorRT support? [y/N]: 
No TensorRT support will be enabled for TensorFlow.

Could not find any cuda.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
of:
        '/opt/cuda/extras/CUPTI/lib64'
        '/opt/cuda/lib64'
        '/opt/cuda/nvvm/lib64'
        '/usr'
        '/usr/lib'
        '/usr/lib/libfakeroot'
        '/usr/lib32'
Asking for detailed CUDA configuration...

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]: 10.1.168

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.6.1.34

Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]: 1.3

Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: 5.0

Could not find any cuda.h matching version '10.1.168' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
of:
Asking for detailed CUDA configuration...

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]:

So I cannot configure the build. I also tried typing 10.1 instead of 10.1.168, and even choosing the default option, with the same result. Is there something I have to do beforehand?


Hi, I have solved it.

find -name cuda.h

Once you have found it, paste the directories at the prompt "Please specify the comma-separated list of base paths to look for CUDA libraries and headers", for example:

/usr/lib/x86_64-linux-gnu/,/usr/local/cuda,

Do not forget to separate the paths with commas!
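As a rough sketch of that step: in the transcript above, the base paths prompt was answered with 5.0, and the error that follows shows an empty "of:" list, so it looks like no usable directory was passed. The helper below, base_from_header, is a hypothetical name (not part of TensorFlow) that turns a header location into the directory to enter at the prompt. The search roots are assumptions; adjust them for your system.

```shell
# Hypothetical helper: derive a base path from the location of a header.
base_from_header() {
  # Drop the file name, then a trailing /include component if there is one.
  local dir
  dir=$(dirname "$1")
  case "$dir" in
    */include) dirname "$dir" ;;
    *) printf '%s\n' "$dir" ;;
  esac
}

# Assumed search roots; on Manjaro the cuda package appears to live in /opt/cuda.
cuda_h=$(find /opt /usr/local /usr/include -name cuda.h 2>/dev/null | head -n 1)
if [ -n "$cuda_h" ]; then
  echo "base path to enter: $(base_from_header "$cuda_h")"
else
  echo "cuda.h not found under the assumed search roots"
fi
```

If cuda.h and cudnn.h end up under different base paths, enter both, separated by a comma.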


@machine.lyc @mialland.a
I also have an issue building TensorFlow, in my case on Windows:

Repository rule cuda_configure defined at:
  C:/tensorflow/third_party/gpus/cuda_configure.bzl:1399:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
   Traceback (most recent call last):
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 1369, column 38, in _cuda_autoconf_impl
                _create_local_cuda_repository(repository_ctx)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 955, column 35, in _create_local_cuda_repository
                cuda_config = _get_cuda_config(repository_ctx, find_cuda_config_script)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 657, column 30, in _get_cuda_config
                config = find_cuda_config(repository_ctx, find_cuda_config_script, ["cuda", "cudnn"])
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 635, column 41, in find_cuda_config
                exec_result = _exec_find_cuda_config(repository_ctx, script_path, cuda_libraries)
        File "C:/tensorflow/third_party/gpus/cuda_configure.bzl", line 629, column 19, in _exec_find_cuda_config
                return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
        File "C:/tensorflow/third_party/remote_config/common.bzl", line 208, column 13, in execute
                fail(
Error in fail: Repository command failed
Could not find any cudnn.h, cudnn_version.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'

I even added the following in third_party/gpus/find_cuda_config.py:

  cuda_path = "C:/Program\ Files/NVIDIA\ GPU\ Computing\ Toolkit/CUDA/v10.1"
  cudnn_path = "C:/tools/cuda"

This helped only at the python ./configure.py stage; bazel build //tensorflow/tools/pip_package:build_pip_package still fails.
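Before patching find_cuda_config.py, it may be worth checking whether cudnn.h is actually reachable from the base paths. This is a minimal sketch, not TensorFlow's own lookup: find_header is a hypothetical helper that tries the subdirectories listed in the error message (except the 'include/*-linux-gnu' glob, which is left out for simplicity) under each comma-separated base path. The fallback path list is an assumption. On Windows it can be run from the MSYS2 bash that the build already uses.

```shell
# Hypothetical helper: print the first match for a header under the base paths.
#   $1 = header file name, $2 = comma-separated base paths (spaces allowed)
find_header() (
  # The function body is a subshell, so IFS and "set -f" do not leak out.
  IFS=','
  set -f
  for base in $2; do
    [ -n "$base" ] || continue
    for sub in "" include include/cuda extras/CUPTI/include include/cuda/CUPTI; do
      candidate="$base/${sub:+$sub/}$1"
      if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        exit 0
      fi
    done
  done
  exit 1
)

# Use TF_CUDA_PATHS if it is set, otherwise an assumed default list.
if found=$(find_header cudnn.h "${TF_CUDA_PATHS:-/usr,/opt/cuda,/usr/local/cuda}"); then
  echo "found: $found"
else
  echo "cudnn.h not found under the given base paths"
fi
```

If the header is found here but the bazel step still reports it missing, the paths are probably not reaching the repository rule, rather than the files being absent.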

@machine.lyc @mialland.a

I was able to get past the configuration step, but the build now fails with the following error:

PS C:\tensorflow> bazel build --config=opt --config=cuda --define=no_tensorflow_py_deps=true //tensorflow/tools/pip_package:build_pip_package
WARNING: The following configs were expanded more than once: [cuda, using_cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=157
INFO: Reading rc options for 'build' from c:\tensorflow\.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Options provided by the client:
  'build' options: --python_path=C:/Python/CPython38/python.exe
INFO: Reading rc options for 'build' from c:\tensorflow\.bazelrc:
  'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --java_toolchain=//third_party/toolchains/java:tf_java_toolchain --host_java_toolchain=//third_party/toolchains/java:tf_java_toolchain --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --noincompatible_prohibit_aapt1 --enable_platform_specific_config --config=v2
INFO: Reading rc options for 'build' from c:\tensorflow\.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=C:/Python/CPython38/python.exe --action_env PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages --python_path=C:/Python/CPython38/python.exe --config=xla --action_env TF_CUDA_VERSION=10.1 --action_env TF_CUDNN_VERSION=7.6.5 --action_env TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1 --action_env CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1 --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.8 --config=cuda --define=override_eigen_strong_inline=true --action_env TF_CONFIGURE_IOS=0
INFO: Found applicable config definition build:v2 in file c:\tensorflow\.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:xla in file c:\tensorflow\.bazelrc: --action_env=TF_ENABLE_XLA=1 --define=with_xla_support=true
INFO: Found applicable config definition build:cuda in file c:\tensorflow\.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file c:\tensorflow\.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:opt in file c:\tensorflow\.tf_configure.bazelrc: --copt=/arch:AVX --define with_default_optimizations=true
INFO: Found applicable config definition build:cuda in file c:\tensorflow\.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file c:\tensorflow\.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:windows in file c:\tensorflow\.bazelrc: --copt=/w --copt=/D_USE_MATH_DEFINES --host_copt=/D_USE_MATH_DEFINES --cxxopt=/std:c++14 --host_cxxopt=/std:c++14 --config=monolithic --copt=-DWIN32_LEAN_AND_MEAN --host_copt=-DWIN32_LEAN_AND_MEAN --copt=-DNOGDI --host_copt=-DNOGDI --linkopt=/DEBUG --host_linkopt=/DEBUG --linkopt=/OPT:REF --host_linkopt=/OPT:REF --linkopt=/OPT:ICF --host_linkopt=/OPT:ICF --experimental_strict_action_env=true --verbose_failures --distinct_host_configuration=false
INFO: Found applicable config definition build:monolithic in file c:\tensorflow\.bazelrc: --define framework_shared_object=false
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/7e825abd5704ce28b166f9463d4bd304348fd2a9.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: C:/tensorflow/tensorflow/core/BUILD:1749:11: in linkstatic attribute of cc_library rule //tensorflow/core:lib_internal: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'cc_library', the error might have been caused by the macro implementation
WARNING: C:/tensorflow/tensorflow/core/BUILD:2161:16: in linkstatic attribute of cc_library rule //tensorflow/core:framework_internal: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'tf_cuda_library', the error might have been caused by the macro implementation
WARNING: Download from https://mirror.bazel.build/github.com/aws/aws-sdk-cpp/archive/1.7.336.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: C:/tensorflow/tensorflow/core/BUILD:1774:11: in linkstatic attribute of cc_library rule //tensorflow/core:lib_headers_for_pybind: setting 'linkstatic=1' is recommended if there are no object files. Since this rule was created by the macro 'cc_library', the error might have been caused by the macro implementation
WARNING: C:/tensorflow/tensorflow/python/BUILD:4662:11: in py_library rule //tensorflow/python:standard_ops: target '//tensorflow/python:standard_ops' depends on deprecated target '//tensorflow/python/ops/distributions:distributions': TensorFlow Distributions has migrated to TensorFlow Probability (https://github.com/tensorflow/probability). Deprecated copies remaining in tf.distributions will not receive new features, and will be removed by early 2019. You should update all usage of `tf.distributions` to `tfp.distributions`.
WARNING: C:/tensorflow/tensorflow/python/BUILD:115:11: in py_library rule //tensorflow/python:no_contrib: target '//tensorflow/python:no_contrib' depends on deprecated target '//tensorflow/python/ops/distributions:distributions': TensorFlow Distributions has migrated to TensorFlow Probability (https://github.com/tensorflow/probability). Deprecated copies remaining in tf.distributions will not receive new features, and will be removed by early 2019. You should update all usage of `tf.distributions` to `tfp.distributions`.
INFO: Analyzed target //tensorflow/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: C:/tensorflow/tensorflow/core/framework/BUILD:1107:31: Executing genrule //tensorflow/core/framework:attr_value_proto_text_srcs failed (Exit 126): bash.exe failed: error executing command
  cd C:/users/{user}/_bazel_{user}/xv6zejqw/execroot/org_tensorflow
  SET CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET PATH=C:\msys64\usr\bin;C:\msys64\bin;C:\WINDOWS;C:\WINDOWS\System32;C:\WINDOWS\System32\WindowsPowerShell\v1.0
    SET PYTHON_BIN_PATH=C:/Python/CPython38/python.exe
    SET PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages
    SET RUNFILES_MANIFEST_ONLY=1
    SET TF2_BEHAVIOR=1
    SET TF_CONFIGURE_IOS=0
    SET TF_CUDA_COMPUTE_CAPABILITIES=3.8
    SET TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET TF_CUDA_VERSION=10.1
    SET TF_CUDNN_VERSION=7.6.5
    SET TF_ENABLE_XLA=1
    SET TF_NEED_CUDA=1
  C:/msys64/usr/bin/bash.exe -c source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions bazel-out/x64_windows-opt/bin/tensorflow/core/framework tensorflow/core/framework/ tensorflow/core/framework/attr_value.proto tensorflow/core/framework/resource_handle.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/types.proto tensorflow/tools/proto_text/placeholder.txt
Execution platform: @local_execution_config_platform//:platform
/usr/bin/bash: bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions: Bad address
Target //tensorflow/tools/pip_package:build_pip_package failed to build
ERROR: C:/tensorflow/tensorflow/python/BUILD:1419:27 Executing genrule //tensorflow/core/framework:attr_value_proto_text_srcs failed (Exit 126): bash.exe failed: error executing command
  cd C:/users/{user}/_bazel_{user}/xv6zejqw/execroot/org_tensorflow
  SET CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET PATH=C:\msys64\usr\bin;C:\msys64\bin;C:\WINDOWS;C:\WINDOWS\System32;C:\WINDOWS\System32\WindowsPowerShell\v1.0
    SET PYTHON_BIN_PATH=C:/Python/CPython38/python.exe
    SET PYTHON_LIB_PATH=C:/Python/CPython38/lib/site-packages
    SET RUNFILES_MANIFEST_ONLY=1
    SET TF2_BEHAVIOR=1
    SET TF_CONFIGURE_IOS=0
    SET TF_CUDA_COMPUTE_CAPABILITIES=3.8
    SET TF_CUDA_PATHS=C:/tools/cuda,C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
    SET TF_CUDA_VERSION=10.1
    SET TF_CUDNN_VERSION=7.6.5
    SET TF_ENABLE_XLA=1
    SET TF_NEED_CUDA=1
  C:/msys64/usr/bin/bash.exe -c source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/x64_windows-opt-exec-50AE0418/bin/tensorflow/tools/proto_text/gen_proto_text_functions bazel-out/x64_windows-opt/bin/tensorflow/core/framework tensorflow/core/framework/ tensorflow/core/framework/attr_value.proto tensorflow/core/framework/resource_handle.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/types.proto tensorflow/tools/proto_text/placeholder.txt
Execution platform: @local_execution_config_platform//:platform
INFO: Elapsed time: 3.038s, Critical Path: 0.39s
INFO: 21 processes: 17 internal, 4 local.
FAILED: Build did NOT complete successfully

You may need to “install” cuDNN first. The instructions are in the cuDNN Installation Guide (NVIDIA Deep Learning SDK Documentation).

The problem is not that CUDA is missing; it is how TensorFlow is configured for building on Windows.

Hello, have you fixed this problem? I have the same problem too.