Building Tensorflow 1.13 on Jetson Xavier

cmehrshad · June 5, 2019, 10:06pm

Hello All,

I struggled a lot building TensorFlow on the Jetson Xavier and couldn’t find a working script that walks through everything, so after days of searching and trying different things I finally managed to build it from source. I am sharing what I did here in the hope that it helps others who want to do the same. I have tried to list every step, but I may have forgotten a few things, so please feel free to add anything that improves the approach.

System Setup

Product: Jetson AGX Xavier
JetPack: 4.2
TensorFlow: 1.13
Cuda: 10.0
Compute Capability: 7.2
Cudnn: 7.4
TensorRT: 5.0.6
Python: 3.6
bazel: 0.19.2
gcc used for building: 5.5.0
pip: 1.19.1

Building bazel

Install java if you haven’t already done so

sudo apt-get install openjdk-8-jdk

Download dist release 0.19.2 of bazel (bazel-0.19.2-dist.zip) from bazel’s build website
Unpack the downloaded file

unzip bazel-0.19.2-dist.zip

cd to the unzipped directory and build bazel

env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh

The binary should be produced at output/bazel. Feel free to add its directory to your PATH, e.g. in ~/.bashrc:

vim ~/.bashrc
export PATH=/pathToYourBazelDirectory/output${PATH:+:${PATH}} # add this at the end of your file
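
To confirm that a new shell really picks up the freshly built binary, a small check along these lines may help. It is only a sketch: it assumes `bazel version` prints a `Build label:` line, and the helper names are made up for illustration. A bootstrap build may append a suffix to the label, so the comparison is by prefix.

```python
import shutil
import subprocess

EXPECTED_BAZEL = "0.19.2"  # version used in this guide


def parse_build_label(version_output):
    """Return the value of the 'Build label:' line, or None if absent."""
    for line in version_output.splitlines():
        if line.startswith("Build label:"):
            return line.split(":", 1)[1].strip()
    return None


def installed_bazel_label():
    """Run `bazel version` if bazel is on PATH; return its label or None."""
    if shutil.which("bazel") is None:
        return None
    result = subprocess.run(["bazel", "version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_build_label(result.stdout)


if __name__ == "__main__":
    label = installed_bazel_label()
    if label is None:
        print("bazel not found on PATH (or no build label reported)")
    elif label.startswith(EXPECTED_BAZEL):
        print("bazel looks right:", label)
    else:
        print("unexpected bazel version:", label)
```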

Building Tensorflow

Download sources

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow

Checkout tensorflow version

git checkout r1.13

Apply this patch for our ARM (aarch64) architecture

diff --git a/tensorflow/lite/kernels/internal/BUILD b/tensorflow/lite/kernels/internal/BUILD
index 4be3226938..7226f96fdf 100644
--- a/tensorflow/lite/kernels/internal/BUILD
+++ b/tensorflow/lite/kernels/internal/BUILD
@@ -22,15 +22,12 @@ HARD_FP_FLAGS_IF_APPLICABLE = select({
 NEON_FLAGS_IF_APPLICABLE = select({
     ":arm": [
         "-O3",
-        "-mfpu=neon",
     ],
     ":armeabi-v7a": [
         "-O3",
-        "-mfpu=neon",
     ],
     ":armv7a": [
         "-O3",
-        "-mfpu=neon",
     ],
     "//conditions:default": [
         "-O3",

diff --git a/third_party/aws/BUILD.bazel b/third_party/aws/BUILD.bazel
index 5426f79e46..e08f8fc108 100644
--- a/third_party/aws/BUILD.bazel
+++ b/third_party/aws/BUILD.bazel
@@ -24,7 +24,7 @@ cc_library(
         "@org_tensorflow//tensorflow:raspberry_pi_armeabi": glob([
             "aws-cpp-sdk-core/source/platform/linux-shared/*.cpp",
         ]),
-        "//conditions:default": [],
+        "//conditions:default": glob(["aws-cpp-sdk-core/source/platform/linux-shared/*.cpp",]),
     }) + glob([
         "aws-cpp-sdk-core/include/**/*.h",
         "aws-cpp-sdk-core/source/*.cpp",

diff --git a/third_party/gpus/crosstool/BUILD.tpl b/third_party/gpus/crosstool/BUILD.tpl
index db76306ffb..184cd35b87 100644
--- a/third_party/gpus/crosstool/BUILD.tpl
+++ b/third_party/gpus/crosstool/BUILD.tpl
@@ -24,6 +24,7 @@ cc_toolchain_suite(
         "x64_windows|msvc-cl": ":cc-compiler-windows",
         "x64_windows": ":cc-compiler-windows",
         "arm": ":cc-compiler-local",
+        "aarch64": ":cc-compiler-local",
         "k8": ":cc-compiler-local",
         "piii": ":cc-compiler-local",
         "ppc": ":cc-compiler-local",

Install older gcc:

sudo apt-get install g++-5
sudo apt-get install gcc-5

Note: the build did not work with the default gcc 7.4, nor with 4.8 or 8. gcc 5 is the only version I could finally build with.

Create Swap

$ fallocate -l 8G swapfile
$ ls -lh swapfile
$ sudo chmod 600 swapfile
$ ls -lh swapfile
$ sudo mkswap swapfile
$ sudo swapon swapfile
$ swapon -s
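
Before starting a multi-hour build it may be worth double-checking that the swap is really active. The sketch below reads /proc/meminfo and reports the total swap; the function names are hypothetical, and the 8 GiB target is simply the size used above.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (kB for most entries)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_gib(meminfo_text):
    """Total configured swap in GiB (0 if no SwapTotal entry is present)."""
    return parse_meminfo(meminfo_text).get("SwapTotal", 0) / (1024 * 1024)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            total = swap_gib(f.read())
        print("swap: %.1f GiB" % total)
        if total < 7.9:
            print("less than the 8G used in this guide; the build may run out of memory")
    except OSError:
        print("/proc/meminfo is not available on this system")
```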

Configure system build

$ ./configure
Please specify the location of python. [Default is /usr/bin/python]:/usr/bin/python3

Found possible Python library paths:
  /usr/local/lib/python3.6/dist-packages
  /usr/lib/python3.6/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.6/dist-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.0]:

Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/usr/local/cuda-10.0

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]:7.4

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/usr/lib/aarch64-linux-gnu

Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

Please specify the location where TensorRT is installed. [Default is /usr/lib/aarch64-linux-gnu]:

Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]: 7.2

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler. 
    
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:/usr/bin/gcc-5

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
        --config=mkl            # Build with MKL support.
        --config=monolithic     # Config for mostly static monolithic build.
        --config=gdr            # Build with GDR support.
        --config=verbs          # Build with libverbs support.
        --config=ngraph         # Build with Intel nGraph support.
        --config=dynamic_kernels        # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
        --config=noaws          # Disable AWS S3 filesystem support.
        --config=nogcp          # Disable GCP support.
        --config=nohdfs         # Disable HDFS support.
        --config=noignite       # Disable Apacha Ignite support.
        --config=nokafka        # Disable Apache Kafka support.
        --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
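
As far as I know, the configure script can also take its answers from environment variables instead of prompting, which makes the step repeatable. The sketch below collects the answers given above into an environment dict. The variable names are my reading of configure.py in r1.13 and should be checked against your checkout; any prompt that is not covered (NCCL, for example) will still be asked interactively. `configure_env` is a made-up helper name.

```python
import os

# Answers from the interactive session above, keyed by the environment
# variables that configure.py is assumed to read (verify in your checkout).
CONFIGURE_ANSWERS = {
    "PYTHON_BIN_PATH": "/usr/bin/python3",
    "PYTHON_LIB_PATH": "/usr/local/lib/python3.6/dist-packages",
    "TF_ENABLE_XLA": "0",
    "TF_NEED_OPENCL_SYCL": "0",
    "TF_NEED_ROCM": "0",
    "TF_NEED_CUDA": "1",
    "TF_CUDA_VERSION": "10.0",
    "CUDA_TOOLKIT_PATH": "/usr/local/cuda-10.0",
    "TF_CUDNN_VERSION": "7.4",
    "CUDNN_INSTALL_PATH": "/usr/lib/aarch64-linux-gnu",
    "TF_NEED_TENSORRT": "1",
    "TENSORRT_INSTALL_PATH": "/usr/lib/aarch64-linux-gnu",
    "TF_CUDA_COMPUTE_CAPABILITIES": "7.2",
    "TF_CUDA_CLANG": "0",
    "GCC_HOST_COMPILER_PATH": "/usr/bin/gcc-5",
    "TF_NEED_MPI": "0",
    "CC_OPT_FLAGS": "-march=native -Wno-sign-compare",
    "TF_SET_ANDROID_WORKSPACE": "0",
}


def configure_env(base=None):
    """Return a copy of the base environment with the answers added."""
    env = dict(os.environ if base is None else base)
    env.update(CONFIGURE_ANSWERS)
    return env


if __name__ == "__main__":
    # Print the settings in a form that can be pasted into a shell.
    for key in sorted(CONFIGURE_ANSWERS):
        print('export %s="%s"' % (key, CONFIGURE_ANSWERS[key]))
```

The dict returned by `configure_env()` could then be passed as `env=` to `subprocess.run(["./configure"], ...)` from the TensorFlow root.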

Now that you have configured your system build, let’s start building:

bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
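
Since the same flags come up again later in this thread for other targets, it can be handy to assemble the command in one place. This is only a sketch with a hypothetical helper name; the optional `--local_resources` value (RAM in MB, CPU cores, I/O) is the bazel option used further down in the thread to limit resource usage on smaller boards.

```python
import shlex

# Flags shared by every bazel build command in this guide.
COMMON_FLAGS = [
    "--config=opt",
    "--config=nonccl",
    "--incompatible_remove_native_http_archive=false",
    "--verbose_failures",
    "--cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0",
]


def bazel_build_cmd(target, local_resources=None):
    """Return the bazel invocation for `target` as an argument list."""
    cmd = ["bazel", "build"] + COMMON_FLAGS
    if local_resources:
        # e.g. "2048,1.0,1.0" to cap bazel at about 2 GB RAM and one core
        cmd += ["--local_resources", local_resources]
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    cmd = bazel_build_cmd("//tensorflow/tools/pip_package:build_pip_package")
    print(" ".join(shlex.quote(part) for part in cmd))
```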

The build took about 4.5 hours for me. Once it has finished, you can use it to build the pip package:

sudo bazel-bin/tensorflow/tools/pip_package/build_pip_package ../

Install the generated wheel file

sudo pip install ../tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl
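
If pip belongs to a different interpreter than the python3 selected during configure, the install will typically be rejected as an unsupported wheel. The sketch below splits the wheel file name into its tags so the Python tag can be compared with the running interpreter. The helper names are hypothetical, and it assumes the plain five-part wheel name with no optional build tag.

```python
import sys
from collections import namedtuple

WheelName = namedtuple("WheelName", "name version python abi platform")


def parse_wheel_name(filename):
    """Split 'name-version-python-abi-platform.whl' into its five parts."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return WheelName(*parts)


def matches_interpreter(wheel, version_info=sys.version_info):
    """True if the wheel's Python tag is the CPython tag of version_info."""
    return wheel.python == "cp%d%d" % (version_info[0], version_info[1])


if __name__ == "__main__":
    wheel = parse_wheel_name("tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl")
    print(wheel)
    print("matches this interpreter:", matches_interpreter(wheel))
```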

Test!

Testing the python package

$ cd 
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
2019-06-05 15:16:56.295371: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 15:16:56.295657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 9.03GiB
2019-06-05 15:16:56.295785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 15:16:57.766675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 15:16:57.766865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 15:16:57.766933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 15:16:57.767368: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8442 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
2019-06-05 15:16:57.769918: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
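
For an automated check, the device mapping lines can be parsed instead of read by eye. The sketch below is a hypothetical helper that assumes the log format shown above; it extracts the GPU index, name and compute capability from captured log text (TensorFlow writes these messages to stderr, as far as I know).

```python
import re

# Matches lines such as:
# /job:localhost/.../device:GPU:0 -> device: 0, name: Xavier, ..., compute capability: 7.2
_DEVICE_RE = re.compile(
    r"device:GPU:(\d+) -> device: \d+, name: ([^,]+), "
    r".*compute capability: (\d+\.\d+)")


def gpu_devices(log_text):
    """Return sorted unique (index, name, compute_capability) tuples."""
    found = {(int(index), name, capability)
             for index, name, capability in _DEVICE_RE.findall(log_text)}
    return sorted(found)


if __name__ == "__main__":
    sample = ("/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, "
              "name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2")
    print(gpu_devices(sample))
```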

Testing C++:
Using the example from the TensorFlow C++ guide (https://www.tensorflow.org/guide/extend/cc), simply follow the instructions and build with the command given there (you don’t need to run ./configure again). Then test your app:

$ ./tensorflow/bazel-out/aarch64-opt/bin/tensorflow/cc/example/example
2019-06-05 17:49:16.518159: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 17:49:16.518515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 7.92GiB
2019-06-05 17:49:16.518595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 17:49:16.519415: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 17:49:16.519504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 17:49:16.519589: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 17:49:16.520143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7700 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2019-06-05 17:49:18.887245: I tensorflow/cc/example/example.cc:22] 19
-3

So you get the expected results: 19 -3.

EDIT:
Please see the next post for a better approach when using C++ APIs.

References
https://docs.bazel.build/versions/master/install-compile-source.html#bootstrap-bazel
https://www.tensorflow.org/install/source
https://github.com/tensorflow/tensorflow/issues/25323
https://devtalk.nvidia.com/default/topic/1049100/tensorflow-installation-on-drive-px2-/
https://devtalk.nvidia.com/default/topic/1043026/jetson-agx-xavier/building-tensorflow-whl-from-source-for-jetson-agx-solved-/
https://www.tensorflow.org/guide/extend/cc

cmehrshad · June 6, 2019, 3:05am

It turns out that using bazel to compile the C++ test code mentioned above is not practical for day-to-day development. Inspired by this website, what we can do instead is link against the shared libraries.

To use libtensorflow_cc in our C++ code, we first build the shared library (only once):

bazel build --config=opt --config=nonccl //tensorflow:libtensorflow_cc.so --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"

Then let’s say the project files are in a directory called project, located next to the TensorFlow root directory:

cd ../project

Then we are going to compile our project

g++-5 -std=gnu++11 -c ./main.cpp -D_GLIBCXX_USE_CXX11_ABI=0 \
    -I../tensorflow \
    -I../tensorflow/bazel-tensorflow/external/eigen_archive \
    -I../tensorflow/bazel-tensorflow/external/protobuf_archive/src \
    -I../tensorflow/bazel-tensorflow/external/com_google_absl \
    -I../tensorflow/bazel-genfiles

And let’s add the shared libraries to our library path

export LD_LIBRARY_PATH=../tensorflow/bazel-bin/tensorflow:$LD_LIBRARY_PATH

Now we can safely link

g++-5 -std=gnu++11 ./main.o -o main -L../tensorflow/bazel-bin/tensorflow -ltensorflow_cc -ltensorflow_framework
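
If the link step complains that it cannot find -ltensorflow_cc or -ltensorflow_framework, the first thing to check is whether bazel actually produced both libraries. A small sketch of that check, assuming the directory layout used above (the function name is made up):

```python
import os

# Libraries named on the g++ link line above.
REQUIRED_LIBS = ("libtensorflow_cc.so", "libtensorflow_framework.so")


def missing_libs(tf_root):
    """Return the required libraries not found under <tf_root>/bazel-bin/tensorflow."""
    lib_dir = os.path.join(tf_root, "bazel-bin", "tensorflow")
    return [name for name in REQUIRED_LIBS
            if not os.path.exists(os.path.join(lib_dir, name))]


if __name__ == "__main__":
    missing = missing_libs("../tensorflow")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("both shared libraries are present")
```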

And run!

$ ./main 
2019-06-05 22:59:12.912357: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 22:59:12.912740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 4.66GiB
2019-06-05 22:59:12.912937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 22:59:14.301438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 22:59:14.301623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 22:59:14.301721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 22:59:14.302247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4207 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2019-06-05 22:59:15.386130: I ./main.cpp:22] 19
-3

This way, recompiling during development does not take the tremendous amount of time it does with bazel.

AastaLLL · June 6, 2019, 5:42am

Hi,

Thanks for sharing.

Lots of users want to build the TensorFlow C++ library.
We will redirect them to this post for information. : )

r7vme · June 9, 2019, 9:22am

Thanks a lot @cmehrshad. Saved me a bunch of time.

I also decided to pack libtensorflow_cc.so into the wheel package (+50MB), and I’m using the cmake module from https://github.com/PatWie/tensorflow-cmake.

To pack libtensorflow_cc.so into the wheel:

Apply the following patch:

diff --git a/tensorflow/tools/pip_package/BUILD b/tensorflow/tools/pip_package/BUILD
index 4ed2f6ce34..705fde60f3 100644
--- a/tensorflow/tools/pip_package/BUILD
+++ b/tensorflow/tools/pip_package/BUILD
@@ -66,6 +66,7 @@ COMMON_PIP_DEPS = [
     "setup.py",
     ":included_headers",
     "//tensorflow:tensorflow_py",
+    "//tensorflow:libtensorflow_cc.so",
     "//tensorflow/examples/tutorials/mnist:package",
     "//tensorflow/lite/python:interpreter_test_data",
     "//tensorflow/lite/python:tflite_convert",

and rerun

bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
sudo bazel-bin/tensorflow/tools/pip_package/build_pip_package ../ 
sudo pip install --ignore-installed --no-deps  ../tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl
ls -la /usr/local/lib/python3.6/dist-packages/tensorflow/libtensorflow_cc.so
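
The wheel can also be inspected before installing it, since a .whl file is a zip archive. This sketch lists archive members ending in a given suffix; the function name is hypothetical.

```python
import zipfile


def wheel_members_ending_with(wheel_path, suffix):
    """Return the names of archive members whose path ends with `suffix`."""
    with zipfile.ZipFile(wheel_path) as whl:
        return [name for name in whl.namelist() if name.endswith(suffix)]
```

For example, `wheel_members_ending_with("../tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl", "libtensorflow_cc.so")` should return a non-empty list if the patch took effect.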

To use cmake:

Put the patched version of the module (with 1.13 support and support for libtensorflow_cc.so from the wheel) into the “cmake” dir of your project: https://github.com/r7vme/tensorflow-cmake/blob/master/cmake/modules/FindTensorFlow.cmake
Make sure your CMakeLists.txt contains the following pieces:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0")
...
# tensorflow
list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
set(PYTHON_EXECUTABLE "python3")
find_package(TensorFlow REQUIRED)
TensorFlow_REQUIRE_C_LIBRARY()
...

target_link_libraries(main TensorFlow_DEP)

abhijeet.v · June 12, 2019, 12:30pm

Hi,

I followed the same steps and tried compiling on TX2 but failed with the following log:

INFO: From ProtoCompile tensorflow/core/protobuf/replay_log.pb.cc:
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
bazel-out/aarch64-opt/genfiles/external/protobuf_archive/src: warning: directory does not exist.
tensorflow/core/protobuf/replay_log.proto: warning: Import tensorflow/core/protobuf/cluster.proto but not used.
tensorflow/core/protobuf/replay_log.proto: warning: Import tensorflow/core/framework/graph.proto but not used.
ERROR: /home/nvidia/Documents/tensorflow/tensorflow/python/BUILD:4057:1: Linking of rule '//tensorflow/python:_pywrap_tensorflow_internal.so' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /home/nvidia/.cache/bazel/_bazel_nvidia/38f510ce87073e6e7989e37d45036c24/execroot/org_tensorflow &&
exec env -
LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:
PATH=/usr/local/cuda-10.0/bin:/home/nvidia/.local/bin:/home/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
PWD=/proc/self/cwd
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -shared -o bazel-out/host/bin/tensorflow/python/_pywrap_tensorflow_internal.so ‘-Wl,-rpath,$ORIGIN/…/…/_solib_local/_U_S_Stensorflow_Spython_C_Upywrap_Utensorflow_Uinternal.so___Utensorflow’ ‘-Wl,-rpath,$ORIGIN/…/…/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib’ ‘-Wl,-rpath,$ORIGIN/…/…/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccusolver___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib’ ‘-Wl,-rpath,$ORIGIN/…/…/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib’ -Lbazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_C_Upywrap_Utensorflow_Uinternal.so___Utensorflow -Lbazel-out/host/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib -Lbazel-out/host/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccusolver___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib -Lbazel-out/host/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib -Wl,–version-script bazel-out/host/bin/tensorflow/python/pywrap_tensorflow_internal_versionscript.lds ‘-Wl,-rpath,$ORIGIN/,-rpath,$ORIGIN/…’ -Wl,-soname,_pywrap_tensorflow_internal.so -Wl,-z,muldefs -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -Wl,-rpath,…/local_config_cuda/cuda/lib64 -Wl,-rpath,…/local_config_cuda/cuda/extras/CUPTI/lib64 -Wl,-S -Wl,-no-as-needed -Wl,-z,relro,-z,now 
'-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -no-canonical-prefixes -fno-canonical-system-headers -B/usr/bin -Wl,--gc-sections -Wl,@bazel-out/host/bin/tensorflow/python/_pywrap_tensorflow_internal.so-2.params)
collect2: error: ld returned 1 exit status
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 15305.330s, Critical Path: 726.77s, Remote (0.00% of the time): [queue: 0.00%, setup: 0.00%, process: 0.00%]
INFO: 7964 processes: 7964 local.
FAILED: Build did NOT complete successfully

Can you help?

andy.christianson · June 19, 2019, 8:01pm

Thanks for sharing your build notes.

I was able to combine some of the ideas from here with the FloopCZ/tensorflow_cc project and get it working. The latest 2.x tag of TF has all the patches needed for aarch64, so none of the patching is needed if you use that tag.

I put everything into a build script, available in the achristianson/tfcc-jetson repository on GitHub, if anyone is interested. It builds TensorFlow (tensorflow_cc) for NVIDIA Jetson systems; this is for the C++ library only, no Python.

Note that I have only tested this on the Jetson Nano. For it to build on the Nano I needed to plug in a USB flash drive and configure it as swap. I saw swap usage go up as high as ~8GB during the build, so it seems to really need it.
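
As a rough pre-flight check for the swap requirement mentioned above, the following sketch reads the active swap size and compares it against a threshold. The threshold, the variable REQUIRED_SWAP_MB, and the helper names are assumptions made for illustration; they are not part of the tfcc-jetson script.

```bash
#!/usr/bin/env bash
# Sketch: check that enough swap is active before starting a long build.
# REQUIRED_SWAP_MB is an assumption derived from the ~8GB peak reported above;
# it is set slightly below 8192 so that an 8G swap area still passes.
REQUIRED_SWAP_MB=8000

# Print SwapTotal in MB from a meminfo-style file (default: /proc/meminfo).
swap_total_mb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ { print int($2 / 1024) }' "$meminfo"
}

# Return 0 if the given swap size in MB meets the requirement, 1 otherwise.
has_enough_swap() {
    [ "$1" -ge "$REQUIRED_SWAP_MB" ]
}

if [ -r /proc/meminfo ]; then
    total=$(swap_total_mb)
    if has_enough_swap "$total"; then
        echo "swap OK: ${total} MB active"
    else
        echo "only ${total} MB of swap active; consider adding more before building"
    fi
fi
```

If the check reports too little swap, a swap file or a USB drive configured as swap can be added before the build starts.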

abhijeet.v · June 21, 2019, 5:49am

Hi,

Thank you for your response. I will certainly try that. I have also built TensorFlow’s .so successfully. However, when I integrate my module, which loads a graph from a .pb file, it fails. I am sharing the BUILD file and the process below.

My module resides in TF_ROOT_DIR/tensorflow/my_module/ and contains BUILD.txt and my_module.cpp.

BUILD.txt

cc_binary(
    name = "my_module",
    srcs = glob(["*.cpp"]) + glob(["*.h"]),
    linkopts = ["-Wl,--version-script=tensorflow/tf_version_script.lds"],
    deps = [
        "//tensorflow/core:tensorflow",
    ],
    copts = ["-I/usr/local/include/", "-O3"],
    visibility = ["//visibility:public"],
)

my_module.cpp

#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/lib/io/path.h"
#include <iostream>
#include <memory>
#include <string>
using namespace std;

int main()
{
    // Owns the session; it is released automatically when main returns.
    std::unique_ptr<tensorflow::Session> session;

    string pb_file = "/home/nvidia/Documents/test.pb";

    // Load the serialized GraphDef from disk.
    tensorflow::GraphDef graph_def;
    ReadBinaryProto(tensorflow::Env::Default(), pb_file, &graph_def);
    auto options = tensorflow::SessionOptions();

    // Cap this process at 20% of the GPU memory.
    options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(0.2);

    session.reset(tensorflow::NewSession(options));
    session->Create(graph_def);
    std::cout << "Load graph successful..";
}

build command
bazel build --config=opt --config=nonccl --local_resources 2048,1.0,1.0 //tensorflow/my_module:my_module --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"

I am using Bazel 0.21

Thank you

michael5ltw3 · June 22, 2019, 12:08pm

Swap usage is going up because Bazel runs too many compile processes in parallel: one for each core, which is too much for a tiny computer. That’s also why the mouse stops moving. The good thing is that you don’t need Bazel, as TensorFlow contains a script that apparently can also be used for static builds (a feature Bazel doesn’t provide yet and that our OSS project needs, at least in the future):

https://github.com/tensorflow/tensorflow/tree/v1.13.1/tensorflow/contrib/makefile

You can run it with

env JOB_COUNT=1 ./tensorflow/contrib/makefile/build_all_linux.sh

So only one core is used, the load doesn’t go to infinity, and the mouse keeps moving. Also, you don’t need swap.
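
If a single job turns out to be slower than necessary, the job count could be derived from the available RAM instead of being fixed at 1. This is only a sketch: MB_PER_JOB is an assumed memory budget per compile process, not a measured value, and the helper names are made up for illustration. JOB_COUNT is the variable used by the makefile script above, and --jobs is Bazel's flag for limiting parallel actions.

```bash
#!/usr/bin/env bash
# Sketch: choose a parallel job count from RAM rather than one job per core.
# MB_PER_JOB is an assumption; tune it for your board and toolchain.
MB_PER_JOB=2048

# Print MemTotal in MB from /proc/meminfo.
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# pick_jobs <mem_mb> <cores>: limited by memory and by cores, never below 1.
pick_jobs() {
    local mem_mb=$1 cores=$2
    local n=$(( mem_mb / MB_PER_JOB ))
    if [ "$n" -gt "$cores" ]; then n=$cores; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

if [ -r /proc/meminfo ]; then
    n=$(pick_jobs "$(mem_total_mb)" "$(nproc)")
    # Only print the suggested commands; nothing is built here.
    echo "makefile: env JOB_COUNT=$n ./tensorflow/contrib/makefile/build_all_linux.sh"
    echo "bazel:    bazel build --jobs=$n <your usual flags and target>"
fi
```

With the assumed budget, a board with 4 GB of RAM and four cores ends up with at most two jobs, and a board with less than 2 GB falls back to one.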

Building Bazel fails on my Jetson Nano in all versions I tested, even though I followed the advice in this thread. This is day 5 of trying to build TensorFlow for C on the Jetson. If you’re interested in details:

https://github.com/photoprism/photoprism/issues/83

https://github.com/photoprism/photoprism/issues/99

The closest we got was with the original TensorFlow 1.13.1 makefile and the pre-installed, default GCC 7.4.0, except that it has a bug that was fixed in GCC 8.3.1. That fix can be backported to 7.4, but you have to compile GCC on your own, which takes an additional 6 hours as far as I know. We don’t want to spend the weekend in front of a computer when the sun is shining, so I don’t know for sure that it works:

https://github.com/tensorflow/tensorflow/issues/25323

Let’s stop wasting time with workarounds and complicated howtos and get this compiler working. NVIDIA should also offer a compiled version of at least the shared library tensorflow.so for download, so that developers buying their hardware don’t need to spend days and weeks compiling it on their own. If hosting is too expensive, we can put it on our server for free download. My email: michael@photoprism.org Thank you! :)

michael5ltw3 · June 24, 2019, 7:20pm

Bazel and the latest libtensorflow with full GPU support for the Jetson Nano are now available for download at

https://dl.photoprism.org/tensorflow/

The TX and AGX probably need slightly different settings. Let me know if I can help.

Compiling took 12 hours. GCC 4.8 is the only compiler that works reliably. There are other dependencies like Java and Python required in specific versions. Also you need to modify tensorflow, environment variables and the build config manually. We’ll publish more details soon.

abhijeet.v · June 25, 2019, 5:53am

So I figured out it is not feasible to use bazel to compile the C++ test code mentioned above. Inspired by this website, what we can do is use the shared libraries.

To use libtensorflow_cc in our C++ code, we first build the shared library (only once):

bazel build --config=opt --config=nonccl //tensorflow:libtensorflow_cc.so --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"

Then let’s say the project files are in a directory called project that sits beside the root directory of tensorflow:

cd ../project

Then we are going to compile our project

g++-5 -std=gnu++11 -c ./main.cpp -D_GLIBCXX_USE_CXX11_ABI=0 \
    -I../tensorflow \
    -I../tensorflow/bazel-tensorflow/external/eigen_archive \
    -I../tensorflow/bazel-tensorflow/external/protobuf_archive/src \
    -I../tensorflow/bazel-tensorflow/external/com_google_absl \
    -I../tensorflow/bazel-genfiles

And let’s add the shared libraries to our library path

export LD_LIBRARY_PATH=../tensorflow/bazel-bin/tensorflow:$LD_LIBRARY_PATH

Now we can safely link

g++-5 -std=gnu++11 ./main.o -o main -L../tensorflow/bazel-bin/tensorflow -ltensorflow_cc -ltensorflow_framework

And run!

$ ./main 
2019-06-05 22:59:12.912357: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 22:59:12.912740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 4.66GiB
2019-06-05 22:59:12.912937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 22:59:14.301438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 22:59:14.301623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 22:59:14.301721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 22:59:14.302247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4207 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2019-06-05 22:59:15.386130: I ./main.cpp:22] 19
-3

This way, recompiling during development takes far less time than it does with bazel.

Hi,

I tried building with the .so file, but passing the dependencies with
-I../tensorflow/bazel-tensorflow/external/eigen_archive
-I../tensorflow/bazel-tensorflow/external/protobuf_archive/src
-I../tensorflow/bazel-tensorflow/external/com_google_absl
gives “no such file or directory” for all three external dependencies, even though the folder with the symlinks exists.

Am I missing something?

Thank you
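
One thing that may be worth checking for the include-path error above is whether the directories really resolve: a symlink can show up in a listing and still point to a target that no longer exists. The sketch below reports the state of each -I directory. The helper name check_include_dir and the TF_ROOT variable are made up for illustration, and a dangling link is only one possible cause of the error.

```bash
#!/usr/bin/env bash
# Sketch: report whether each include directory resolves to a real directory.
check_include_dir() {
    local dir=$1
    if [ -d "$dir" ]; then
        # Exists (following symlinks) and is a directory.
        echo "ok $dir"
    elif [ -L "$dir" ]; then
        # The link itself exists but its target does not.
        echo "dangling $dir -> $(readlink "$dir")"
    else
        echo "missing $dir"
    fi
}

# TF_ROOT is an assumption: the TensorFlow checkout beside the project directory.
TF_ROOT="${TF_ROOT:-../tensorflow}"
for sub in external/eigen_archive external/protobuf_archive/src external/com_google_absl; do
    check_include_dir "$TF_ROOT/bazel-tensorflow/$sub"
done
```

Running it from the project directory prints one line per include path, which shows whether the path is wrong relative to the current directory or whether a link is dangling.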

r7vme · June 25, 2019, 6:29am

Hi, just in case anyone is going to use TensorFlow C++ with ROS: you have to compile with the following flag (or leave the flag out completely)

--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1"

The problem is described here.

abhijeet.v · June 25, 2019, 8:10am

I would really appreciate it if you could help with the Jetson TX2 flashed with JetPack 4.2.

Thank you

michael5ltw3 · June 25, 2019, 4:47pm

Our changes are now waiting for approval:

https://github.com/tensorflow/tensorflow/pull/30136

There already was a similar PR but they’ve just merged this in TF 2.0 beta for some reason…

@abhijeet.v: You’ll need the changes in this PR, Bazel 0.24.1 and a config similar to this one: https://dl.photoprism.org/tensorflow/tf_configure.bazelrc

Note that you have to change the CUDA architecture version depending on your GPU. NVIDIA docs will tell you the right version.

If you build upon our work for a commercial project or need additional support (email: michael@photoprism.org), it would be really nice if you can donate for our open-source project… I’ve invested 7 days in getting this compiled correctly (and reading a lot of boring documentation):

https://github.com/photoprism/photoprism/blob/develop/SPONSORS.md

abhijeet.v · June 27, 2019, 12:05pm

Hi,

I have successfully built the wheel file and TensorFlow’s .so on the Jetson TX2 using the approach from the original post. However, checking out the r1.13 branch doesn’t work! Please use the v1.13.1 tag and apply the patches.

Thank you. :-)

abhijeet.v · June 27, 2019, 12:06pm

Hi,

Thank you for your response. I succeeded in building the pip package and the .so for TF v1.13.1 on the Jetson TX2 after around 3 weeks!

Thank you

jonathanhtxp3 · August 2, 2019, 7:49am

@michael5ltw3 any reason to not include tensorflow/c/tf_attrtype.h in your build?

After adding this header, and setting up the correct symlinks, I was able to compile/symlink with your library on a Jetson TX2.

lrwxrwxrwx 1 build build   28 Aug  2 15:58 libtensorflow_framework.so -> libtensorflow_framework.so.1
lrwxrwxrwx 1 build build   33 Aug  2 15:58 libtensorflow_framework.so.1 -> libtensorflow_framework.so.1.14.0
-r-xr-xr-x 1 build build  34M Aug  2 14:10 libtensorflow_framework.so.1.14.0
lrwxrwxrwx 1 build build   18 Aug  2 15:58 libtensorflow.so -> libtensorflow.so.1
lrwxrwxrwx 1 build build   23 Aug  2 15:58 libtensorflow.so.1 -> libtensorflow.so.1.14.0
-r-xr-xr-x 1 build build 274M Aug  2 14:10 libtensorflow.so.1.14.0
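
The symlink chain in the listing above could be recreated with a small helper like the one below. This is a sketch: the function name link_versioned and the LIB_DIR variable are made up for illustration, and the version 1.14.0 is taken from the file names in the listing.

```bash
#!/usr/bin/env bash
# Sketch: create name -> name.MAJOR -> name.VERSION links for a shared library.
# Usage: link_versioned <dir> <name> <version>
link_versioned() {
    local dir=$1 name=$2 version=$3
    local major=${version%%.*}
    # Relative targets keep the links valid if the directory is moved.
    ln -sfn "$name.$version" "$dir/$name.$major"
    ln -sfn "$name.$major" "$dir/$name"
}

# LIB_DIR is an assumption: the directory holding the unpacked libraries.
LIB_DIR="${LIB_DIR:-.}"
for lib in libtensorflow.so libtensorflow_framework.so; do
    # Only link libraries that are actually present.
    if [ -f "$LIB_DIR/$lib.1.14.0" ]; then
        link_versioned "$LIB_DIR" "$lib" 1.14.0
    fi
done
```

After running it in the library directory, ls -l should show the same three-step chain for each library as in the listing.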

naman.rawal · November 6, 2019, 3:43pm

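I am still facing this linker failure on the TX2 (collect2: error: ld returned 1 exit status while linking //tensorflow/python:_pywrap_tensorflow_internal.so). Can anyone help me with what should be done differently to resolve this issue?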

AastaLLL · November 18, 2019, 8:43am

Hi,

We have provided a prebuilt package of TensorFlow v1.13 for Jetson users.

You don’t need to build it on your own.
It can be downloaded and installed directly with the command shared here:
https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

Thanks.

douglas.l · November 25, 2019, 2:58pm

Thanks, that was helpful for getting started with the Python API. I’ve also downloaded michael5ltw3’s binaries from earlier in this thread (Building Tensorflow 1.13 on Jetson Xavier), and these seem to work well. Are there NVIDIA builds of libtensorflow.so and libtensorflow_framework.so too, so that I can work with this from C?

leemanyu · February 28, 2020, 1:03pm

        --config=verbs          # Build with libverbs support.
        --config=ngraph         # Build with Intel nGraph support.
        --config=dynamic_kernels        # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
        --config=noaws          # Disable AWS S3 filesystem support.
        --config=nogcp          # Disable GCP support.
        --config=nohdfs         # Disable HDFS support.
        --config=noignite       # Disable Apacha Ignite support.
        --config=nokafka        # Disable Apache Kafka support.
        --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
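The answers above can probably also be supplied without typing them in each time. The sketch below assumes that the configure script of this release reads the listed environment variables; the names are my recollection of TensorFlow's configure.py and should be checked against your checkout. Any question without a matching variable (NCCL, for example) would still be asked interactively.

```python
import os
import subprocess

# Answers from the interactive session above, keyed by the environment
# variables that configure.py is ASSUMED to read (verify in your checkout).
CONFIGURE_ANSWERS = {
    "PYTHON_BIN_PATH": "/usr/bin/python3",
    "PYTHON_LIB_PATH": "/usr/local/lib/python3.6/dist-packages",
    "TF_ENABLE_XLA": "0",
    "TF_NEED_OPENCL_SYCL": "0",
    "TF_NEED_ROCM": "0",
    "TF_NEED_CUDA": "1",
    "TF_CUDA_VERSION": "10.0",
    "CUDA_TOOLKIT_PATH": "/usr/local/cuda-10.0",
    "TF_CUDNN_VERSION": "7.4",
    "CUDNN_INSTALL_PATH": "/usr/lib/aarch64-linux-gnu",
    "TF_NEED_TENSORRT": "1",
    "TENSORRT_INSTALL_PATH": "/usr/lib/aarch64-linux-gnu",
    "TF_CUDA_COMPUTE_CAPABILITIES": "7.2",
    "TF_CUDA_CLANG": "0",
    "GCC_HOST_COMPILER_PATH": "/usr/bin/gcc-5",
    "TF_NEED_MPI": "0",
    "CC_OPT_FLAGS": "-march=native -Wno-sign-compare",
    "TF_SET_ANDROID_WORKSPACE": "0",
}


def configure_env(base_env=None):
    """Return a copy of the environment with the configure answers added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(CONFIGURE_ANSWERS)
    return env


def run_configure(source_dir):
    """Run ./configure inside the TensorFlow checkout; returns its exit code."""
    return subprocess.call(["./configure"], cwd=source_dir, env=configure_env())
```

Calling run_configure("/path/to/tensorflow") (a placeholder path) would then start the configuration with these values already set.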

Now that you have configured your system build, let’s start building:

bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"

Once the build has finished (it took about 4.5 hours for me), you can use it to build the package:

sudo bazel-bin/tensorflow/tools/pip_package/build_pip_package ../

Install the generated wheel file:

sudo pip install ../tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl
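If pip rejects the wheel as unsupported on this platform, one thing worth checking is whether the pip being used belongs to the same Python the wheel was built for. The tags are encoded in the file name (name-version-python tag-ABI tag-platform tag, per PEP 427). The sketch below splits the name and compares the Python tag with the running interpreter; the function names are hypothetical.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_running_python(tags):
    """True if the wheel's CPython tag matches the interpreter running this code."""
    return tags["python"] == "cp%d%d" % sys.version_info[:2]


if __name__ == "__main__":
    tags = parse_wheel_filename("tensorflow-1.13.1-cp36-cp36m-linux_aarch64.whl")
    print(tags, "matches this Python:", matches_running_python(tags))
```

Running it with python3 on the Xavier should report a match for the cp36 wheel; running it with a different interpreter would not.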

Test!

Testing the python package

$ cd 
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
2019-06-05 15:16:56.295371: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 15:16:56.295657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 9.03GiB
2019-06-05 15:16:56.295785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 15:16:57.766675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 15:16:57.766865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 15:16:57.766933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 15:16:57.767368: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8442 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
2019-06-05 15:16:57.769918: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
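The same check can be scripted instead of reading the log by eye. This is a minimal sketch that lists the GPU devices TensorFlow reports; it assumes the device_lib module is importable from the installed package, and the function name is made up for illustration.

```python
def gpu_device_names():
    """Return the names of GPU devices visible to TensorFlow.

    Returns None when TensorFlow cannot be imported at all.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


if __name__ == "__main__":
    names = gpu_device_names()
    if names is None:
        print("TensorFlow is not installed for this Python")
    elif not names:
        print("TensorFlow imported, but no GPU device was found")
    else:
        print("GPU devices:", names)
```

An empty list would suggest the wheel was built or installed without working CUDA support.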

Testing C++:
Using the example provided here, simply follow the instructions and build with the command given there (you don’t need to run ./configure again, though). Then test your app:

$ ./tensorflow/bazel-out/aarch64-opt/bin/tensorflow/cc/example/example
2019-06-05 17:49:16.518159: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-05 17:49:16.518515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.5
pciBusID: 0000:00:00.0
totalMemory: 15.33GiB freeMemory: 7.92GiB
2019-06-05 17:49:16.518595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-05 17:49:16.519415: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-05 17:49:16.519504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-05 17:49:16.519589: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-05 17:49:16.520143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7700 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2019-06-05 17:49:18.887245: I tensorflow/cc/example/example.cc:22] 19
-3

So you get the expected results: 19 -3.

EDIT:
Please see the next post for a better approach when using C++ APIs.

References
https://docs.bazel.build/versions/master/install-compile-source.html#bootstrap-bazel
https://www.tensorflow.org/install/source
https://github.com/tensorflow/tensorflow/issues/25323
https://devtalk.nvidia.com/default/topic/1049100/tensorflow-installation-on-drive-px2-/
https://devtalk.nvidia.com/default/topic/1043026/jetson-agx-xavier/building-tensorflow-whl-from-source-for-jetson-agx-solved-/
https://www.tensorflow.org/guide/extend/cc

Hi,

I have successfully built the wheel file and TensorFlow’s .so on a Jetson TX2 using this approach. However, checking out the r1.13 branch doesn’t work!! Please use the v1.13.1 tag and apply the patches.

Thank you. :-)

Hi,
I’ve followed all the steps in this post and used tag v1.13.1 to build TensorFlow on a TX2. However, I keep getting the following error at the bazel build step: Cuda Configuration Error: Cannot find line containing ‘define NV_TENSORRT_SONAME_MAJOR’ in /usr/include/aarch64-linux-gnu/NvInfer.h.

tensorflow git:(6612da8951) ✗ bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package --incompatible_remove_native_http_archive=false --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
ERROR: Skipping ‘//tensorflow/tools/pip_package:build_pip_package’: error loading package ‘tensorflow/tools/pip_package’: Encountered error while reading extension file ‘build_defs.bzl’: no such package ‘@local_config_tensorrt//’: Traceback (most recent call last):
File “/home/lab421/tensorflow/third_party/tensorrt/tensorrt_configure.bzl”, line 171
_trt_lib_version(repository_ctx, trt_install_path)
File “/home/lab421/tensorflow/third_party/tensorrt/tensorrt_configure.bzl”, line 82, in _trt_lib_version
find_cuda_define(repository_ctx, trt_header_dir, “NvI…”, …)
File “/home/lab421/tensorflow/third_party/gpus/cuda_configure.bzl”, line 569, in find_cuda_define
auto_configure_fail((“Cannot find line containing '%…)))
File “/home/lab421/tensorflow/third_party/gpus/cuda_configure.bzl”, line 342, in auto_configure_fail
fail((”\n%sCuda Configuration Error:%…)))

Cuda Configuration Error: Cannot find line containing ‘define NV_TENSORRT_SONAME_MAJOR’ in /usr/include/aarch64-linux-gnu/NvInfer.h
WARNING: Target pattern parsing failed.
ERROR: error loading package ‘tensorflow/tools/pip_package’: Encountered error while reading extension file ‘build_defs.bzl’: no such package ‘@local_config_tensorrt//’: Traceback (most recent call last):
File “/home/lab421/tensorflow/third_party/tensorrt/tensorrt_configure.bzl”, line 171
_trt_lib_version(repository_ctx, trt_install_path)
File “/home/lab421/tensorflow/third_party/tensorrt/tensorrt_configure.bzl”, line 82, in _trt_lib_version
find_cuda_define(repository_ctx, trt_header_dir, “NvI…”, …)
File “/home/lab421/tensorflow/third_party/gpus/cuda_configure.bzl”, line 569, in find_cuda_define
auto_configure_fail((“Cannot find line containing '%…)))
File “/home/lab421/tensorflow/third_party/gpus/cuda_configure.bzl”, line 342, in auto_configure_fail
fail((”\n%sCuda Configuration Error:%…)))

Cuda Configuration Error: Cannot find line containing ‘define NV_TENSORRT_SONAME_MAJOR’ in /usr/include/aarch64-linux-gnu/NvInfer.h
INFO: Elapsed time: 0.341s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
currently loading: tensorflow/tools/pip_package

I would really appreciate it if you could help.
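The message says the TensorRT configuration step could not find a "#define NV_TENSORRT_SONAME_MAJOR" line in NvInfer.h. One way to narrow this down is to look at which version macros the installed header actually defines. The sketch below is only a diagnostic aid: the function names are hypothetical, and the header path is the one from the error message.

```python
import re


def find_define(header_text, macro):
    """Return the value of `#define <macro> <value>` in the text, or None."""
    pattern = re.compile(
        r"^\s*#\s*define\s+" + re.escape(macro) + r"\s+(\S+)",
        re.MULTILINE,
    )
    match = pattern.search(header_text)
    return match.group(1) if match else None


def tensorrt_macros(header_path,
                    macros=("NV_TENSORRT_MAJOR", "NV_TENSORRT_SONAME_MAJOR")):
    """Read a header file and report the value of each macro (None if absent)."""
    with open(header_path, errors="replace") as f:
        text = f.read()
    return {macro: find_define(text, macro) for macro in macros}


# Example call on the Jetson (path taken from the error message):
#   tensorrt_macros("/usr/include/aarch64-linux-gnu/NvInfer.h")
```

If NV_TENSORRT_SONAME_MAJOR comes back as None, the installed header simply lacks that macro, which would point to the installed TensorRT version rather than to the build steps. In that case, answering n to the TensorRT question in ./configure should avoid this check, at the cost of building without TensorRT support.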
