TensorFlow 2.x on Jetson Nano

Sorry if this is a duplicate, but I’m confused. I used the SD card image jetbot_image_v0p4p0.zip from the JetBot Wiki. When I import TensorFlow and check the version, it shows v1.14. I was expecting TensorFlow 2.x with JetPack 4.3.

Is there a new SD image based on TensorFlow 2.x? Or can I upgrade my Jetson developer kit to TF 2.x? Any pointers appreciated.

Hi,

You can install TensorFlow 2.1 with this command:

$ sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v43 tensorflow==2.1.0+nv20.3

For more information, please check this document:

Thanks.

Thank you for the quick response. Can you please explain to a newbie what this command will do to my existing environment (the SD card image I linked above, with JetPack 4.3)? I understand it will install TensorFlow 2.1, but I already have TF 1.14, so will I end up with both TensorFlow versions installed? Shouldn’t they at least be in separate Python environments? Also, JetPack 4.3 seems to have two versions of Python: when I run which python in a terminal I get Python 2.7, but I can create Python 3.6 notebooks in the JupyterLab interface (when logged into the Jetson Nano). Finally, why use sudo?

Hi,

1. No. It will uninstall TF 1.14 and replace it with the new version.

2. We don’t provide a TensorFlow package for Python 2 anymore.
That’s why you need to install it with pip3 rather than pip.

3. sudo gives you the permissions needed to install the package system-wide.

Thanks.
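
To see which interpreter and which TensorFlow version a given Python is actually using, a small check like the one below can help. This is only a sketch: the helper name installed_version is made up for illustration, and it should be run with python3 (or from a Python 3 notebook), since that is the interpreter pip3 installs into.

```python
import importlib
import sys


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # The interpreter path shows whether this is the Python 2 or Python 3 install.
    print("interpreter:", sys.executable)
    print("python     :", sys.version.split()[0])
    print("tensorflow :", installed_version("tensorflow"))
```

If the replacement described above worked, only one TensorFlow version should be reported for Python 3.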

I tried the above, but ran into errors. The first one was about enum:
AttributeError: module ‘enum’ has no attribute ‘IntFlag’. I uninstalled enum34, following a Stack Overflow answer. Now I’m getting an error about lapack/blas:
numpy.distutils.system_info.NotFoundError: No lapack/blas resources found.

full output:
jetbot@jetson-4-3:~$ sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v43 tensorflow==2.1.0+nv20.3
[sudo] password for jetbot:
Looking in indexes: Simple index, https://developer.download.nvidia.com/compute/redist/jp/v43
Collecting tensorflow==2.1.0+nv20.3
Using cached https://developer.download.nvidia.com/compute/redist/jp/v43/tensorflow/tensorflow-2.1.0%2Bnv20.3-cp36-cp36m-linux_aarch64.whl (236.9 MB)
Collecting tensorflow-estimator<2.2.0,>=2.1.0rc0
Using cached tensorflow_estimator-2.1.0-py2.py3-none-any.whl (448 kB)
Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (0.1.8)
Collecting tensorboard<2.2.0,>=2.1.0
Using cached tensorboard-2.1.1-py3-none-any.whl (3.8 MB)
Requirement already satisfied: gast==0.2.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (0.2.2)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (1.1.0)
Requirement already satisfied: wheel>=0.26; python_version >= “3” in /usr/lib/python3/dist-packages (from tensorflow==2.1.0+nv20.3) (0.30.0)
Requirement already satisfied: numpy<2.0,>=1.16.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (1.16.1)
Collecting keras-applications>=1.0.8
Using cached Keras_Applications-1.0.8-py3-none-any.whl (50 kB)
Collecting keras-preprocessing>=1.1.0
Using cached Keras_Preprocessing-1.1.0-py2.py3-none-any.whl (41 kB)
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ReadTimeoutError(“HTTPSConnectionPool(host=‘developer.download.nvidia.com’, port=443): Read timed out. (read timeout=15)”,)’: /compute/redist/jp/v43/opt-einsum/
Collecting opt-einsum>=2.3.2
Using cached opt_einsum-3.2.1-py3-none-any.whl (63 kB)
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (1.26.0)
Requirement already satisfied: protobuf>=3.8.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==2.1.0+nv20.3) (3.11.2)
Collecting scipy==1.4.1; python_version >= “3”
Using cached scipy-1.4.1.tar.gz (24.6 MB)
Installing build dependencies … done
Getting requirements to build wheel … done
Preparing wheel metadata … error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpsi_hop9e
cwd: /tmp/pip-install-lz7w55n7/scipy
Complete output (140 lines):
lapack_opt_info:
lapack_mkl_info:
customize UnixCCompiler
libraries mkl_rt not found in [‘/usr/local/lib’, ‘/usr/lib’, ‘/usr/lib/aarch64-linux-gnu’]
NOT AVAILABLE

openblas_lapack_info:
customize UnixCCompiler
customize UnixCCompiler
  libraries openblas not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/aarch64-linux-gnu']
  NOT AVAILABLE

openblas_clapack_info:
customize UnixCCompiler
customize UnixCCompiler
  libraries openblas,lapack not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/aarch64-linux-gnu']
  NOT AVAILABLE

atlas_3_10_threads_info:
Setting PTATLAS=ATLAS
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries tatlas,tatlas not found in /usr/local/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
  libraries tatlas,tatlas not found in /usr/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib/aarch64-linux-gnu
customize UnixCCompiler
  libraries tatlas,tatlas not found in /usr/lib/aarch64-linux-gnu
<class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
  NOT AVAILABLE

atlas_3_10_info:
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries satlas,satlas not found in /usr/local/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
  libraries satlas,satlas not found in /usr/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib/aarch64-linux-gnu
customize UnixCCompiler
  libraries satlas,satlas not found in /usr/lib/aarch64-linux-gnu
<class 'numpy.distutils.system_info.atlas_3_10_info'>
  NOT AVAILABLE

atlas_threads_info:
Setting PTATLAS=ATLAS
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
  libraries ptf77blas,ptcblas,atlas not found in /usr/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib/aarch64-linux-gnu
customize UnixCCompiler
  libraries ptf77blas,ptcblas,atlas not found in /usr/lib/aarch64-linux-gnu
<class 'numpy.distutils.system_info.atlas_threads_info'>
  NOT AVAILABLE

atlas_info:
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries f77blas,cblas,atlas not found in /usr/local/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
  libraries f77blas,cblas,atlas not found in /usr/lib
customize UnixCCompiler
  libraries lapack_atlas not found in /usr/lib/aarch64-linux-gnu
customize UnixCCompiler
  libraries f77blas,cblas,atlas not found in /usr/lib/aarch64-linux-gnu
<class 'numpy.distutils.system_info.atlas_info'>
  NOT AVAILABLE

accelerate_info:
  NOT AVAILABLE

lapack_info:
customize UnixCCompiler
  libraries lapack not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/aarch64-linux-gnu']
  NOT AVAILABLE

lapack_src_info:
  NOT AVAILABLE

  NOT AVAILABLE

setup.py:420: UserWarning: Unrecognized setuptools command ('dist_info --egg-base /tmp/pip-modern-metadata-_pokvxv8'), proceeding with generating Cython sources and expanding templates
  ' '.join(sys.argv[1:])))
Running from scipy source directory.
/usr/local/lib/python3.6/dist-packages/numpy/distutils/system_info.py:636: UserWarning:
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.
  self.calc_info()
/usr/local/lib/python3.6/dist-packages/numpy/distutils/system_info.py:636: UserWarning:
    Lapack (http://www.netlib.org/lapack/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [lapack]) or by setting
    the LAPACK environment variable.
  self.calc_info()
/usr/local/lib/python3.6/dist-packages/numpy/distutils/system_info.py:636: UserWarning:
    Lapack (http://www.netlib.org/lapack/) sources not found.
    Directories to search for the sources can be specified in the
    numpy/distutils/site.cfg file (section [lapack_src]) or by setting
    the LAPACK_SRC environment variable.
  self.calc_info()
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py", line 133, in prepare_metadata_for_build_wheel
    return hook(metadata_directory, config_settings)
  File "/usr/local/lib/python3.6/dist-packages/setuptools/build_meta.py", line 156, in prepare_metadata_for_build_wheel
    self.run_setup()
  File "/usr/local/lib/python3.6/dist-packages/setuptools/build_meta.py", line 237, in run_setup
    self).run_setup(setup_script=setup_script)
  File "/usr/local/lib/python3.6/dist-packages/setuptools/build_meta.py", line 142, in run_setup
    exec(compile(code, __file__, 'exec'), locals())
  File "setup.py", line 540, in <module>
    setup_package()
  File "setup.py", line 536, in setup_package
    setup(**metadata)
  File "/usr/local/lib/python3.6/dist-packages/numpy/distutils/core.py", line 137, in setup
    config = configuration()
  File "setup.py", line 435, in configuration
    raise NotFoundError(msg)
numpy.distutils.system_info.NotFoundError: No lapack/blas resources found.
----------------------------------------

ERROR: Command errored out with exit status 1: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpsi_hop9e Check the logs for full command output.
jetbot@jetson-4-3:~$

Hi,

numpy.distutils.system_info.NotFoundError: No lapack/blas resources found.

This indicates that there is no lapack/blas library in your environment.
Please install them with this command:

$ sudo apt-get install gfortran libopenblas-dev liblapack-dev

Thanks.
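
As a rough way to confirm that the libraries are visible after the apt-get step, the standard-library ctypes.util.find_library can be queried from Python. This is a sketch under assumptions: the function name find_math_libraries is hypothetical, and find_library only reports runtime shared libraries, so a hit here does not guarantee that the scipy build will find everything it needs.

```python
from ctypes.util import find_library


def find_math_libraries(names=("openblas", "lapack", "blas")):
    """Map each library name to what the system linker lookup reports.

    A value of None means the library was not found by this lookup.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in find_math_libraries().items():
        print("{:10s} {}".format(name, found if found else "NOT FOUND"))
```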

Done! Then I went back to installing TensorFlow 2.1 using

sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v43 tensorflow==2.1.0+nv20.3

but it took a very long time and seemed never to finish installing scipy. It appeared stuck on “Building wheel for scipy (PEP 517)”.

After some Googling I installed pep517:

sudo pip3 install pep517

Then I tried installing TensorFlow 2.1 as above, but it was still stuck on scipy, with no error reported, just “Building wheel for scipy”.

So I left it “stuck” and didn’t kill it, and eventually it finished successfully. Updating this thread in case someone else runs into this issue.

Thanks for the post! I was able to get TensorFlow 2 installed on my Nano by following the above instructions.

However, this install does not allow for the use of the GPU. I confirmed this by running some code while monitoring GPU and CPU usage with jtop; the GPU stays at 0% throughout.

TF appears to not be able to find the necessary CUDA files. You can see this when you import tensorflow as tf:

tf2020-05-31 14:24:03.544504: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library ‘libcudart.so.10.0’; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory

2020-05-31 14:24:03.544570: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

.version

2020-05-31 14:24:06.309734: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library ‘libnvinfer.so.6’; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory

2020-05-31 14:24:06.309935: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library ‘libnvinfer_plugin.so.6’; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory

2020-05-31 14:24:06.309972: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

My system has CUDA 10.2 installed in /usr/local/cuda-10.2. The files that TF is looking for are close, but differ slightly in name and location.

For instance, my installation contains the file:

/usr/local/cuda-10.2/targets/aarch64-linux/lib/libcudart.so

But there is no file called libcudart.so.10.0, which TensorFlow wants.

Any help would be great. It looks like a configuration issue …
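
To compare the library names that TensorFlow asks for with the ones actually on disk, a short search like the following can be used. It is a sketch only: the function name find_shared_libs and the default search roots are assumptions based on the paths mentioned in this thread, not an official layout.

```python
import glob
import os


def find_shared_libs(stem, roots=("/usr/local/cuda*", "/usr/lib/aarch64-linux-gnu")):
    """Return every file under the given roots whose name starts with '<stem>.so'."""
    matches = []
    for root in roots:
        # '**' with recursive=True also matches files directly inside the root.
        pattern = os.path.join(root, "**", stem + ".so*")
        matches.extend(glob.glob(pattern, recursive=True))
    return sorted(set(matches))


if __name__ == "__main__":
    for stem in ("libcudart", "libnvinfer", "libnvinfer_plugin"):
        print(stem, "->", find_shared_libs(stem) or "nothing found")
```

If the listing shows, for example, only libcudart.so.10.2 while the warning names libcudart.so.10.0, the wheel was most likely built against a different CUDA version than the one installed.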


This is the test code that I’m using. It’s Google’s simple MNIST example; it takes about 13 seconds to train on my Mac and 65 seconds on the Nano (hopefully much less once the GPU is working!).

# simple example from
#  https://www.tensorflow.org/datasets/keras_example
# first needed to 
# $ pip install tensorflow_datasets

import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds

tfds.disable_progress_bar()
tf.enable_v2_behavior()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)


def normalize_img(image, label):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32) / 255., label


ds_train = ds_train.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(128)
ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(128)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(0.001),
    metrics=['accuracy'],
)

model.fit(
    ds_train,
    epochs=6,
    validation_data=ds_test,
)

Out of curiosity I ran your code on my Jetson Nano.
With CPU: approx. 55 sec
With GPU: approx. 75 sec!
(Don’t be too shocked; I have seen other examples where the Nano GPU speeds up processing by a factor of 10 or so.)

By the way, I also don’t have libcudart.so.10.0, but I do have libcudart.so.10.2, which matches CUDA version 10.2
(using TF 2.2.1)
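
A CPU versus GPU comparison like this can be reproduced by pinning the same operation to each device. The sketch below assumes TensorFlow 2.1 or newer (where tf.config.list_physical_devices is available); the helper names visible_gpus and time_matmul and the matrix size are illustrative choices, not taken from the posts above.

```python
import time


def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


def time_matmul(device_name, size=1000, repeats=5):
    """Time a few square matrix multiplies pinned to one device, e.g. '/CPU:0'."""
    import tensorflow as tf

    with tf.device(device_name):
        a = tf.random.normal((size, size))
        start = time.perf_counter()
        for _ in range(repeats):
            b = tf.matmul(a, a)
        # Copy the result back to host memory so the work is finished before timing stops.
        b.numpy()
        return time.perf_counter() - start
```

Calling time_matmul("/CPU:0") and, when visible_gpus() is not empty, time_matmul("/GPU:0") gives two numbers that can be compared directly. An empty list from visible_gpus() means TensorFlow is running on the CPU only.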

Hi @mictiemix,

Thank you for the test runs and quick feedback! It looks like you got TF2.2.1 working on the Nano, with (selectable) GPU support then? Was this a custom build, or a build from NVIDIA?

I have tried building TF2 natively on the Nano, using some guides on the net, but got stuck. I eventually got Bazel installed (TF2 is quite picky about the exact version), but then the build asked for NCCL support (which apparently can be bypassed). Then I found this forum, with a direct download.

Your speed tests are informative. I’ve written code that shows essentially no speedup on the GPU, since most of the time is spent on memory shuffling and other operations (which are actually quite slow on the Nano).

The MNIST test example above uses a very small network and is likely memory/CPU bound, as your tests appear to show.

I rewrote some test code, for a more realistic use case (large networks). The code below stacks the deck for the GPU, and also breaks out the time into three sections:

time to create initial variables (in numpy)
time to create the TF2 variables
time to run a lot of matrix multiplies 

I don’t have my nano on me currently, but have included the results from my MacBook, as a reference … hoping this is the type of code that can show some nice speedup, on Nano + TF2 + GPU.

# simple test example with large matrices,
# for tests with TF2 on various platforms

import tensorflow as tf
import numpy as np
import time

print('running ...\n')

start_time = time.time()  # start the timer

# create initial variables (observations are rows)
x_init = np.random.randn(1, 10000)
y_init = np.random.randn(1, 10000)
# randn returns float64 (about 800 MB here); the float32 copy made by TF is
# 100,000,000 values * 4 (bytes/float32) = 400 MB
W_init = np.random.randn(10000, 10000)

# print out init time, and reset timer
init_time = time.time() - start_time
print('time for creating initial values:  ', np.round(init_time, 2))

start_time = time.time()  # reset the timer

# setup TF2 variables (and constants)
x = tf.constant(x_init, dtype=tf.float32)
y = tf.constant(y_init, dtype=tf.float32)

W = tf.Variable(W_init, dtype=tf.float32)

# print out setup time, and reset timer
setup_time = time.time() - start_time
print('time for initializing TF variables:', np.round(setup_time, 2))

start_time = time.time()  # reset the timer

# run lots of matrix multiplies
# note: ** binds tighter than @, so this is x @ (W ** 5), with an element-wise power
for i in range(5):
    y_hat = x @ (W ** 5)  # heat ALL the processors!

# print out the run time and total time
run_time = time.time() - start_time
print('run time (matrix multiplies):      ', np.round(run_time, 2))
start_time = time.time()  # reset the timer

total_time = init_time + setup_time + run_time
print('\ntotal elapsed:                     ', np.round(total_time, 2))

print('\ndone!')

# output on MacBook Pro (2.5 GHz Quad-Core Intel Core i7),
# default TF2 install from github (not processor optimized)
#
#    running ...
#
#    time for creating initial values:   3.42
#    time for initializing TF variables: 0.26
#    run time (matrix multiplies):       2.58
#
#    total elapsed:                      6.27
#
#    done!
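
The repeated "start the timer, print, reset the timer" pattern in the script above can also be written with a small context manager, and the memory estimate can be checked with plain arithmetic. This is a sketch using only the standard library; the names timed and matrix_bytes are hypothetical, and the loop is a stand-in workload rather than the matrix benchmark itself.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def matrix_bytes(rows, cols, bytes_per_value=4):
    """Size of a dense matrix in bytes (4 bytes per value for float32)."""
    return rows * cols * bytes_per_value


if __name__ == "__main__":
    timings = {}
    with timed("setup", timings):
        values = [float(i) for i in range(100000)]
    with timed("run", timings):
        total = sum(v * v for v in values)
    for label, seconds in timings.items():
        print("{:6s} {:.4f} s".format(label, seconds))
    print("10000 x 10000 float32:", matrix_bytes(10000, 10000) / 1e6, "MB")
```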

@spindown:
Yes, I built TF2.1.1 from source. It was quite a hassle and took me two or three attempts of about 40 hours each until I got it running.
The main issues I faced were finding the right versions of Bazel and cuDNN.
Since I have JetPack 4.4 installed, cuDNN 8.0 is installed as well, but as far as I understand TF2.1.1 supports only cuDNN 7.6.
If you are interested, I can provide more details or even upload my TF2.1.1 .whl file to a shared drive.

Yes, thank you! I’m sure this would be quite helpful to a number of people. Jetson has been quite a rabbit hole, even getting it set up and running correctly in the first place …

If you have a GitHub account, you could just upload the .whl to a repo. That’s probably the simplest shared space, and it also allows for other resources (build notes and scripts), if you wish.

I’ve seen other guides to building TF2 on Jetson, like Building TensorFlow 2.0.0 on Jetson Nano, but got stuck at some point. (Also, it is not clear if this is a GPU version.)

Longer term, it would be great if NVIDIA could fold GPU support into their TF2 builds … that way, GPU would be available for the different platforms (like the Xavier!), and stay up to date with future JetPack or TF2 releases.

I have not tried it myself yet, but judging by the script, it compiles TF2.0.0 with CUDA support.

@spindown:
As far as I know, GitHub allows a maximum file size of 100 MB, and the .whl is approx. 150 MB, so I decided to use MEGA.

You should be able to download tensorflow-2.1.1-cp36-cp36m-linux_aarch64.whl (GPU-version) from File on MEGA (download via browser).

TF2.1.1 does not seem to work with cuDNN 8.0, which is installed by JetPack 4.4, so you need to install cuDNN 7.6.4 following the guidance in Installation Guide :: NVIDIA Deep Learning cuDNN Documentation.
When installing the .deb file I got ‘broken dependencies’, so I recommend using the .tgz file, which also lets you keep both versions of cuDNN on your device. I use ‘update-alternatives’ to select between them.

In addition I set these two environment variables (although I am not sure whether they are needed): TF_CUDNN_VERSION=7.6.4 and
JETSON_CUDNN=7.6.4
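
With two cuDNN versions installed side by side, it can be useful to check which one the dynamic loader actually picks up. The sketch below calls cudnnGetVersion() through ctypes. It assumes that an unversioned libcudnn.so is on the loader path and that the version number is encoded as major*1000 + minor*100 + patch, which holds for cuDNN 7.x and 8.x as far as I know. The function names are illustrative.

```python
import ctypes


def decode_cudnn_version(raw):
    """Split a cuDNN 7.x/8.x version number such as 7604 into (major, minor, patch)."""
    major, rest = divmod(raw, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version(library="libcudnn.so"):
    """Return the version of the cuDNN library the loader finds, or None if absent."""
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None
    # cudnnGetVersion takes no arguments and returns a size_t.
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    print("cuDNN picked up by the loader:", loaded_cudnn_version())
```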

Let me know if you need further info

Thank you so much, it works. I have tested it with classification and regression.

Hi, thanks for the compiled Tensorflow! I’m having trouble finding the prebuilt cuDNN 7.6.4, do you have a link or an apt repository for it? Or did you also compile cuDNN yourself?

Strange,
it seems that NVIDIA has removed cuDNN for ARM64 …

I found it here: https://developer.nvidia.com/cuda-toolkit/arm. The DEB installation worked, but TF was somehow broken (I’m using an NX, not a Nano): it got stuck after import. Let’s hope that NVIDIA fixes this soon.

sorry to hear that it’s not working for you …