As discussed in this thread, NVIDIA doesn’t include the TensorFlow C libs, so we have to build them ourselves from source.
For JetPack 4.6.1, the compatibility table says TensorFlow version 2.7.0. Can I directly build the open-source TensorFlow 2.7.0, or is there a special NVIDIA-patched 2.7.0 that I should use? If the former, since open-source TensorFlow recently released 2.7.1 with many security fixes, will that work even though it is not mentioned in the compatibility table?
I tried building 2.7.0 C libs, but it failed in the end:
bazel --host_jvm_args=-Xmx1g --host_jvm_args=-Xms512m \
build -c opt \
--config=tensorrt --config=cuda \
--config=noaws --config=nogcp --config=nohdfs --config=nonccl \
--python_path="/usr/bin/python3" \
--action_env=PYTHON_BIN_PATH="/usr/bin/python3" \
--action_env=PYTHON_LIB_PATH="/usr/lib/python3/dist-packages" \
--action_env=CUDA_TOOLKIT_PATH="/usr/local/cuda-10.2" \
--action_env=TF_CUDA_COMPUTE_CAPABILITIES="7.2" \
--action_env=GCC_HOST_COMPILER_PATH="/usr/bin/aarch64-linux-gnu-gcc-7" \
//tensorflow/tools/lib_package:libtensorflow
...
...
tensorflow/compiler/tf2tensorrt/stub/nvinfer_plugin_stub.cc:64:2: error: #error This version of TensorRT is not supported.
 #error This version of TensorRT is not supported.
  ^~~~~
Target //tensorflow/tools/lib_package:libtensorflow failed to build
Use --verbose_failures to see the command lines of failed build steps.
Our source is similar to upstream, with some compatibility fixes.
Based on the source below, TensorFlow only supports up to TensorRT 8.0.
Maybe you can try a newer TensorFlow version or turn off TensorRT support.
I tried compiling TF 2.8.0 with JP 4.6.1 using Python 3, but got the errors below:
...
ERROR: /var/log/tensorflow/tensorflow-2.8.0/tensorflow/python/lib/core/BUILD:49:11: Compiling tensorflow/python/lib/core/bfloat16.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/aarch64-opt/bin/tensorflow/python/lib/core/_objs/bfloat16_lib/bfloat16.pic.d ... (remaining 63 argument(s) skipped)
In file included from /usr/include/c++/7/math.h:36:0,
                 from bazel-out/aarch64-opt/bin/external/local_config_python/python_include/pyport.h:194,
                 from bazel-out/aarch64-opt/bin/external/local_config_python/python_include/Python.h:53,
                 from ./tensorflow/python/lib/core/bfloat16.h:19,
                 from tensorflow/python/lib/core/bfloat16.cc:16:
/usr/include/c++/7/cmath: In static member function 'static void tensorflow::{anonymous}::BinaryUFunc<InType, OutType, Functor>::Call(char**, const npy_intp*, const npy_intp*, void*) [with InType = Eigen::bfloat16; OutType = Eigen::bfloat16; Functor = tensorflow::{anonymous}::ufuncs::CopySign]':
/usr/include/c++/7/cmath:1302:40: internal compiler error: Segmentation fault
 { return __builtin_copysignf(__x, __y); }
                                        ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
...
I haven’t compiled TensorFlow myself before, and don’t have patches for this, but an internal compiler error (ICE) like this is typically a sign that you may want/need to upgrade the compiler version. Or you can try manually patching the source to avoid the call that triggers it.
It would seem that’s what that patch indeed does, so you could try it and see if it works (as previously mentioned, I have only done this with PyTorch, not TensorFlow).
As I mentioned above, “copysign” was added in TensorFlow 2.5+, and since JP 4.6.1 supports TensorFlow 2.7.0, can you check your internal codebase to see how you patched it to build 2.7.0 properly, and provide the patch here? Thanks!
I was hoping to see how NVIDIA was able to compile TF 2.7 and release the Python wheel files. If that’s not feasible, perhaps @dusty_nv can provide an unofficial patch, since he was the one who provided the PyTorch patch.
For now, I am able to get it to compile by doing the below, but I am not sure if it is correct.
Yes, it compiles, but I don’t have a way to do a full test; I don’t know whether my application even exercises that piece of code. A patch provided by you would be used by ALL Jetson users, so it would get far more exposure and testing. Also, I assume you have your own full test suite that could catch issues.
Since upstream TensorFlow 2.5+ won’t compile for Jetson, it makes sense for you to submit your patch upstream, so it can benefit all Jetson users and be maintained by the upstream team.
As @dusty_nv mentioned here, the “copysign” error I encountered was due to the compiler version bundled with JP 4.6.1 (Ubuntu 18.04). JP 5.0 ships a newer compiler, so you will NOT see the error there.
We have checked this with our internal team.
They did not hit the copysign error when compiling TensorFlow 2.7 on JetPack 4.6.1.
So would you mind using TensorFlow 2.7 instead?
For the TensorRT compatibility issue, you can work around it with a patch similar to the one shared above.