Is TensorFlow 2.0 supported on Jetson TX2?

Hello,

I am able to install earlier TF (TensorFlow) versions by following the instructions here: https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

When will NVIDIA release the .whl for TF 2.0? I can't find one here yet: https://developer.download.nvidia.com/compute/redist/jp/v42/tensorflow-gpu/

Thanks!

Hi,

We don’t have an official TensorFlow v2.0 package yet.
The latest package for Jetson is v1.14.0.

Although there is no prebuilt package available, you can still build it from source on your own.
Here is a related topic that can give you some ideas:
https://devtalk.nvidia.com/default/topic/1055131/jetson-agx-xavier/building-tensorflow-1-13-on-jetson-xavier/

Thanks.

Hi,

Sorry, I missed this comment earlier.
Yes, that’s what I went ahead and did. I successfully built TensorFlow 2.0 on the Jetson TX2.

Ubuntu: 18.04
CUDA: 10.0
TensorRT: 5.1
Python: 3.6.8 {NOTE: I used a virtual env as I was experimenting with different versions}

TensorFlow: 2.0.0rc0 (From GitHub source: https://github.com/tensorflow/tensorflow/tree/r2.0)
Bazel: 0.26.1 (https://github.com/bazelbuild/bazel/releases, https://docs.bazel.build/versions/master/install-ubuntu.html)

Steps that worked for me:

Start here --> https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html#prereqs

pip3 install -U wheel
bazel build --config=opt --config=cuda --config=nonccl --config=v2 --local_resources 8192,2.0,1.0  //tensorflow/tools/pip_package:build_pip_package

{NOTE: This step can take a long time if done on the TX2 itself. Cross-compiling is another option, but I didn’t do that.}

./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip3 install -U /tmp/tensorflow_pkg/tensorflow-2.0.0rc0-cp36-cp36m-linux_aarch64.whl
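A note for anyone adapting the build command above to a different board: as far as I know, in Bazel releases of that era the three numbers after --local_resources are the RAM budget in MB, the number of CPU cores, and an I/O capacity factor (1.0 being an average workstation). Below is a minimal sketch that assembles the flag from those three values. The helper name local_resources_flag is made up for illustration and is not part of Bazel or TensorFlow.

```python
def local_resources_flag(ram_mb, cpu_cores, io_capacity=1.0):
    """Format the legacy Bazel --local_resources value: RAM (MB), CPU cores, I/O.

    Hypothetical helper for illustration only; it just builds the string
    that is passed on the bazel command line.
    """
    if ram_mb <= 0 or cpu_cores <= 0 or io_capacity <= 0:
        raise ValueError("all resource values must be positive")
    return "--local_resources {},{},{}".format(
        int(ram_mb), float(cpu_cores), float(io_capacity)
    )


# The values used in the post above: 8192 MB RAM, 2 cores, average I/O.
print(local_resources_flag(8192, 2, 1.0))
```

Lowering the RAM and core numbers makes the build slower but less likely to exhaust memory on a small board.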

Hope it helps someone who is trying the same!

Hey @srkm009, how did you compile Bazel 0.26.1? Did it work directly out of the box or did you have to do anything “special” for that?

Hi @klein.shaked:

Yes, as far as I remember it worked out of the box. I just followed the instructions here: https://docs.bazel.build/versions/master/install-ubuntu.html to build it from source.

It is important to check which Bazel version works with your TensorFlow version. The latest Bazel release is sometimes too new for TensorFlow, and you may need to downgrade.
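To make that check concrete, here is a small sketch that reads the version out of the text printed by `bazel version` (the "Build label:" line) and compares it against a supported range. The bounds below are assumptions for illustration; take the real ones from the TensorFlow checkout or documentation you are building against. The function names are hypothetical.

```python
import re


def parse_build_label(version_output):
    """Return the 'X.Y.Z' string from `bazel version` output, or None."""
    match = re.search(r"Build label:\s*(\d+(?:\.\d+)*)", version_output)
    return match.group(1) if match else None


def as_tuple(version):
    """Turn '0.26.1' into (0, 26, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def bazel_in_range(installed, minimum, maximum):
    """True if minimum <= installed <= maximum."""
    return as_tuple(minimum) <= as_tuple(installed) <= as_tuple(maximum)


# Assumed bounds, for illustration only.
ASSUMED_MIN = "0.24.1"
ASSUMED_MAX = "0.26.1"

# Sample text in the shape `bazel version` prints; not captured from a real run.
sample_output = "Build label: 0.26.1\nBuild target: (sample)\n"

installed = parse_build_label(sample_output)
print(installed, bazel_in_range(installed, ASSUMED_MIN, ASSUMED_MAX))
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating "0.9.0" as newer than "0.26.1".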

Hey @srkm009

Which gcc did you use?

I’m using

gcc --version
gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

But seeing this error:

ERROR: /xavier_ssd/home/tensorflow/tensorflow/lite/kernels/BUILD:321:1: C++ compilation of rule '//tensorflow/lite/kernels:builtin_op_kernels' failed (Exit 1)
In file included from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8.h:23:0,
                 from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_multithread.h:22,
                 from tensorflow/lite/kernels/depthwise_conv.cc:28:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h: In static member function 'static void tflite::optimized_ops::depthwise_conv::PackMacroBlock<(tflite::DepthwiseConvImplementation)3, (tflite::DepthwiseConvDepthMultiplication)0, 0>::PackMacroBlockNeon(const uint8*, int8*, const tflite::optimized_ops::depthwise_conv::DepthwiseConvDotProdParams*)':
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5981:47: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
           input_data_a = vld1q_u8(input_data_0);
                                               ^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5981:47: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5982:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
           input_data_b = vld1q_u8(input_data_0 + 1 * input_depth);
                                                                 ^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5983:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
           input_data_c = vld1q_u8(input_data_0 + 2 * input_depth);
                                                                 ^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5984:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
           input_data_d = vld1q_u8(input_data_0 + 3 * input_depth);

any ideas?

@klein.shaked, according to the TensorFlow compatibility table, I would try a lower version of gcc.

https://www.tensorflow.org/install/source

You should be able to specify the path of your lower gcc-7 version while running ./configure

In case this is useful to anyone, I’ve found that you can build just about anything for the Jetsons on AWS arm64 instances (which are only available in the US East (Ohio) region). It’s much faster, and I wouldn’t dream of trying to build anything directly on a Jetson after getting this working.

Here’s a repo I made that does this for opencv: https://github.com/vincilab/build-arm-opencv

I’ve also built Tensorflow 1.15.0 the same way.

Hi,

The above examples are for a CPU build, which does not use the Jetson's GPU. Has anyone compiled tensorflow-gpu 2.0 for Jetson? Is tensorflow-gpu 2.0 still not available for Jetson? Again, I don’t want to use TensorFlow Lite because it doesn’t support all the functions.

Thanks

Ashik

What is the reason that you say it’s only CPU and not using the GPU of the Jetson?

I’m asking because I have built TF2 for ARM on Xavier and when I used some code examples it seemed to behave as expected. Maybe I missed something?

Thank you

TF 1.15+ should have GPU support baked into the main tensorflow package: https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0
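One way to check a given install, rather than going by the package name, is to ask TensorFlow itself. This is a minimal sketch assuming a TensorFlow 2.x Python install. It uses tf.test.is_built_with_cuda() and the list_physical_devices call (under tf.config.experimental in 2.0, also directly under tf.config in later releases). The function name gpu_report is made up for illustration.

```python
def gpu_report():
    """Summarize whether the installed TensorFlow was built with CUDA and sees a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return {"installed": False}

    # Prefer the non-experimental name when present, else fall back to 2.0's.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices

    return {
        "installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in list_devices("GPU")],
    }


print(gpu_report())
```

An empty "gpus" list together with built_with_cuda being True would suggest a runtime or driver problem rather than a CPU-only build.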

• Srkm009 used the following command: “pip3 install -U /tmp/tensorflow_pkg/tensorflow-2.0.0rc0-cp36-cp36m-linux_aarch64.whl”. It clearly says cpu.
• According to my understanding, TensorFlow is written for x86 integrated with a GPU or TPU. Jetson, on the other hand, has an ARM core. So does it compile properly for Jetson with its integrated GPU, given that it uses an ARM core?
• They have released TensorFlow Lite as the official version of TensorFlow for ARM-type computers. They have also released the TensorFlow Lite GPU delegate for ARM devices with a GPU (mobile GPUs such as Mali).
• So my question is: why is there a TensorFlow Lite version (arm64) if a regular TensorFlow build works on arm64? If we compile tensorflow-gpu 2.0 for Jetson, will it work? How do you test whether tf-gpu works on it and performs optimally?
I may be wrong here, but I’m just trying to understand.
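On the wheel name: under the standard wheel naming convention (PEP 427), the fields after the version are the Python tag, ABI tag, and platform tag. "cp36" therefore means CPython 3.6, not CPU, and the filename by itself says nothing about whether CUDA support was compiled in. A short sketch that splits the name into its fields follows. The function name parse_wheel_filename is hypothetical and the parser is deliberately simplified.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 fields (simplified sketch)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields in: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,      # cp36 = CPython 3.6
        "abi_tag": abi_tag,            # cp36m = CPython 3.6 ABI
        "platform_tag": platform_tag,  # linux_aarch64 = 64-bit ARM Linux
    }


fields = parse_wheel_filename("tensorflow-2.0.0rc0-cp36-cp36m-linux_aarch64.whl")
for key, value in fields.items():
    print(key, "=", value)
```

Whether the build actually uses the GPU has to be checked at runtime, not read off the filename.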

Hi,

We have released a TensorFlow v2.0 package for Jetson.
Please reflash your device with JetPack 4.3 and install TensorFlow with the command here:
https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

Thanks.

Does it include the C library support as well? Or is it only for Python3?

The reason I’m asking is because we have 3 different use cases:

  1. TF2 Dockerfile for C lib support
  2. TF2 Dockerfile for Python support
  3. TF2 Dockerfile for both Python and C lib support

Hi,
Thank you so much. I installed TensorFlow 2.1.0 on the TX2.

I am trying to build TensorFlow 2.0.0 on the TX2. Below are the details.

System information

  • OS Platform and Distribution: JetPack 4.3
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.0.0
  • Python version: 2.7
  • Bazel version (if compiling from source): 0.26.1
  • GCC/Compiler version (if compiling from source): 7.5
  • CUDA/cuDNN version: 10.0/7.6.3
  • GPU model and memory: Tegra TX2

Describe the current behavior
When I try to run a sample application, it crashes with the error “CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error”.

Describe the expected behavior
It should start a session.

Standalone code to reproduce the issue
Sample Code

#include <tensorflow/core/platform/env.h>
#include <tensorflow/core/public/session.h>
#include <iostream>
using namespace std;
using namespace tensorflow;

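int main()
{
    Session* session;  // not a usable session if NewSession() fails
    Status status = NewSession(SessionOptions(), &session);
    if (!status.ok()) {
        cout << status.ToString() << "\n";
        // There is no early return here, so execution continues below.
    }
    // Still runs after a failed NewSession(); likely where the program crashes.
    session->Close();
    cout << "Session successfully created.\n";
}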

CMakeLists.txt

cmake_minimum_required(VERSION 3.10)

project(test_hello)

set(TENSORFLOW_LIBRARIES tensorflow_cc protobuf)
add_executable(example example.cpp)
set(CUDA_NVCC_FLAGS
        ${CUDA_NVCC_FLAGS};
        -O3 -gencode arch=compute_30,code=sm_30;
        --std=c++11
        )
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Link the Tensorflow library.
# TensorFlow headers
include_directories("/usr/local/include/")
include_directories("/usr/local/include/tensorflow/")
include_directories("/usr/local/include/third-party/")
target_link_libraries(example "/usr/local/lib/libtensorflow_cc.so")

# You may also link cuda if it is available.
find_package(CUDA)
if(CUDA_FOUND)
  target_link_libraries(example ${CUDA_LIBRARIES})
endif()

Other info / logs

$ ./example 
2020-07-08 15:01:34.122381: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-08 15:01:34.127413: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3
pciBusID: 0000:00:00.0
2020-07-08 15:01:34.127608: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-07-08 15:01:34.127693: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127815: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Segmentation fault (core dumped)

I am very new to both TensorFlow and CUDA programming, and I am not sure what is going on. I would appreciate any help.
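One detail in the post above: the log reports the device as "major: 6 minor: 2" (compute capability 6.2), while the CMakeLists.txt passes compute_30/sm_30 in CUDA_NVCC_FLAGS. Whether that is related to the crash is not established in this thread. The example has no CUDA sources of its own, so those nvcc flags may never be used. A small sketch for deriving the matching -gencode option from the log line follows. The helper name gencode_from_log is made up for illustration.

```python
import re


def gencode_from_log(log_text):
    """Build an nvcc -gencode option from a TensorFlow 'Found device' log entry."""
    match = re.search(r"major:\s*(\d+)\s+minor:\s*(\d+)", log_text)
    if match is None:
        return None
    capability = match.group(1) + match.group(2)  # "6" and "2" become "62"
    return "-gencode arch=compute_{0},code=sm_{0}".format(capability)


# Device line copied from the log in the post above.
log_line = "name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3"
print(gencode_from_log(log_line))
```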

Hi harendra,

Please refer to the topic “TensorFlow for Jetson TX2!” and try the latest JetPack 4.4 GA.
If the issue is still present, please open a new topic.
Thanks