Building TensorFlow v2.16.1 from source on Jetson Orin (JetPack 5.1) fails

Description

I followed the “Build from source” instructions on Build from source  |  TensorFlow
and ran “configure” with the following options:

  • CUDA support enabled (version 11.8)
  • nvcc as the CUDA compiler (with gcc 11.4)

Then, run the command:

$ bazel build //tensorflow/tools/pip_package:wheel --repo_env=WHEEL_NAME=tensorflow_cuda --config=cuda

Note: Bazel 6.5.0 is the version that matches TensorFlow v2.16.1.
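
One way to double-check the version note above is to compare the installed Bazel against the version pinned in the `.bazelversion` file at the root of the TensorFlow checkout. This is only a minimal sketch, not part of the official build steps: `bazel_matches_pin` is a hypothetical helper name, and it assumes the version number is on the first line of the pin file.

```shell
# Hypothetical helper: succeed if the installed Bazel version equals the pinned one.
#   $1 = output of `bazel --version`, e.g. "bazel 6.5.0"
#   $2 = path to the pin file (version assumed to be on its first line)
bazel_matches_pin() {
  installed=${1#bazel }                            # strip the leading "bazel " prefix
  pinned=$(head -n 1 "$2" | tr -d '[:space:]')     # first line, whitespace removed
  [ "$installed" = "$pinned" ]
}

# Example use from the TensorFlow source root (requires bazel on PATH):
#   bazel_matches_pin "$(bazel --version)" .bazelversion && echo "Bazel version OK"
```

If the check fails, the build may stop early with a Bazel version complaint that has nothing to do with the compile errors reported below.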

Error messages:

ERROR: /xxxx/download/tensorflow_master_031120024/tensorflow/python/framework/BUILD:2571:18: Compiling tensorflow/python/framework/test_ops.cu.cc [for tool] failed: (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //tensorflow/python/framework:test_ops_kernels_gpu) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/aarch64-opt-exec-50AE0418/bin/tensorflow/python/framework/_objs/test_ops_kernels_gpu/test_ops.cu.d ... (remaining 190 arguments skipped)
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(38): error: identifier "__Int8x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(39): error: identifier "__Int16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(40): error: identifier "__Int32x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(41): error: identifier "__Int64x1_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(42): error: identifier "__Float16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(43): error: identifier "__Float32x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(44): error: identifier "__Poly8x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h(45): error: identifier "__Poly16x4_t" is undefined
.......
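
Since the log is long and repetitive, it can help to reduce it to the distinct identifiers that the compiler rejects. The snippet below is a rough sketch for that; `undefined_identifiers` is a hypothetical helper name, and the log file name in the usage comment is an assumption (the build output would first have to be saved to a file).

```shell
# Hypothetical helper: list each distinct identifier reported as undefined.
#   $1 = path to a saved build log
undefined_identifiers() {
  grep -o 'error: identifier "[^"]*"' "$1" \
    | cut -d'"' -f2 \
    | LC_ALL=C sort -u
}

# Example use (build.log is an assumed file name):
#   bazel build ... 2>&1 | tee build.log
#   undefined_identifiers build.log
```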

Environment

GPU Type: Jetson Orin
CUDA Version: 11.8
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.9
TensorFlow Version (if applicable): 2.16.1

Notes

This is an attempt to build CUDA-enabled TensorFlow from source with Python 3.9. The Jetson TensorFlow download is not available for Python 3.9 with CUDA 11.8. The error messages point to an incompatibility in the aarch64 data types. I believe that building Jetson TensorFlow (any version) internally at NVIDIA would run into the same situation, so I would like to know how NVIDIA resolves it: are there configuration options, or are the TensorFlow source files patched?

Also, configuring without CUDA produces the same errors.
“./configure” options:

  • Without CUDA
  • nvcc or clang 17 as the compiler

Run:
$ bazel build //tensorflow/tools/pip_package:wheel --repo_env=WHEEL_NAME=tensorflow_cpu

Thanks.

The question is how to turn off the Arm NEON feature when building TensorFlow. The Arm CPU is:

processor	: 0
model name	: ARMv8 Processor rev 1 (v8l)
BogoMIPS	: 62.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd42
CPU revision	: 1
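
In the Features line above, `asimd` (Advanced SIMD) is the flag under which the Linux kernel reports NEON support on aarch64. As a small sketch, the check below looks for a given flag in cpuinfo-style text. `has_cpu_feature` is a hypothetical helper name; it only inspects what the CPU reports and does not change how TensorFlow is built.

```shell
# Hypothetical helper: succeed if a flag is listed in the first "Features"
# line of a cpuinfo-style file.
#   $1 = flag name, e.g. asimd
#   $2 = file to read (defaults to /proc/cpuinfo)
has_cpu_feature() {
  sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' "${2:-/proc/cpuinfo}" 2>/dev/null \
    | head -n 1 \
    | grep -qw -- "$1"     # -w: whole-word match, so "asimd" does not match "asimdhp"
}

if has_cpu_feature asimd; then
  echo "asimd (NEON) is reported by this CPU"
else
  echo "asimd (NEON) is not reported by this CPU"
fi
```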

Any idea how to turn off NEON in the build?

Hi @Nqubits ,
I would recommend raising the issue on Issues · tensorflow/tensorflow · GitHub for better assistance.

thanks

The Jetson AGX Orin uses Arm 8.1, a special arm64 platform that is generally not supported by either the TensorFlow team or the Arm TensorFlow team (who could otherwise help with building TensorFlow binaries for Arm 8.1a). I guess that the binary wheel file,

tensorflow-2.12.0+nv23.05-cp38-cp38-linux_aarch64.whl

was probably created by the Google TensorFlow team (not NVIDIA) on the Jetson AGX platform. It seems that the TensorFlow team does not have the resources to support build issues when building from source.

I give up here.