Available: TensorFlow 1.5 for Jetson TX2

jesp.hc · January 17, 2018, 9:41am

Hi guys,

Thank you for all the good information that you make available to us here.

I just wanted to share with you that I successfully built and installed TensorFlow 1.5 on the Jetson TX2.
I have made the wheel file for installation publicly available at: GitHub - JesperChristensen89/TensorFlow-Jetson-TX2: Pre-built wheel files for installing TensorFlow on Jetson TX2

I have tested the installation with CUDA 8 and cuDNN 6 and have successfully deployed SSD models from the TensorFlow Object Detection API in a Jupyter Notebook environment with TensorFlow.
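
For anyone installing the wheel, a minimal sketch of a post-install check might look like the following. It only uses `tf.__version__`, `tf.test.is_built_with_cuda()` and `tf.test.gpu_device_name()`; the helper name `describe_tensorflow` is made up for illustration and is not part of the wheel or of TensorFlow.

```python
def describe_tensorflow():
    """Return basic facts about the installed TensorFlow, or None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # No TensorFlow in this environment; nothing to report.
        return None
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty string means TensorFlow found no usable GPU.
        "gpu_device": tf.test.gpu_device_name(),
    }


print(describe_tensorflow())
```

On a working GPU build one would expect `built_with_cuda` to be true and `gpu_device` to be non-empty; the exact device string depends on the TensorFlow version.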

CHuang1 · January 17, 2018, 11:27pm

jesp-hc,
Thanks for sharing your effort with Jetson community.

AastaLLL · January 18, 2018, 2:16am

Thanks for sharing.

We have released CUDA 9.0 in JetPack3.2 DP.
Could you also try to build TensorFlow-1.5 with JetPack3.2 DP?

Thanks.

jesp.hc · January 18, 2018, 8:11am

I tested TensorFlow 1.5 along with CUDA 9.0 and cuDNN 7.0 from JetPack3.2 DP and can confirm that it works as well.
I am still not able to run any of the Faster R-CNN models, see: Faster R-CNN: too many resources requested for launch - Jetson TX2 - NVIDIA Developer Forums
However, smaller models such as SSD work perfectly fine.

AastaLLL · January 22, 2018, 8:29am

Hi,

Let’s check this issue on topic 1028798:

Thanks.

kallud32qg · February 5, 2018, 12:48pm

Hi Jesper & AastaLLL

I’ve built tensorflow 1.5 on Jetpack 3.2 with the following combinations:

Bazel 0.8.0 / Bazel 0.9.0
GCC 4.8.5 / 5.4.0
CUDA 9.0

My experience on the TX2 is that stability (it doesn’t always start) and inference performance aren’t that great.
Could you please share some of the details of your build so I can reproduce it on JetPack 3.2?

I’m on Jetpack 3.2 because the performance/stability was roughly the same on 3.1.

Best regards,
Kalevi

jesp.hc · February 5, 2018, 1:11pm

Hi Kalevi,

Could you please expand a bit on your issues and on what information you are looking for?

Have you tried my build of TF 1.5 on JetPack 3.2?

Best,
Jesper

kallud32qg · February 6, 2018, 8:40am

Hi Jesper,

I tried your wheel, but as it’s built for JetPack 3.1 and CUDA 8, it can’t load the CUDA 8 libraries it expects on my JetPack 3.2 (CUDA 9.0) system.
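
A rough way to see which CUDA and cuDNN runtime libraries the dynamic loader can actually find is sketched below. The library sonames in `CANDIDATES` are assumptions about what a CUDA 8 / cuDNN 6 build and a CUDA 9.0 / cuDNN 7 build would link against, and `loadable_libraries` is a hypothetical helper, not something shipped with the wheel.

```python
import ctypes


def loadable_libraries(names):
    """Return the subset of shared-library names the dynamic loader can open."""
    found = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # Library is missing or cannot be loaded on this system.
            continue
        found.append(name)
    return found


# Assumed sonames: the first pair for CUDA 8 / cuDNN 6, the second for CUDA 9.0 / cuDNN 7.
CANDIDATES = ["libcudart.so.8.0", "libcudnn.so.6", "libcudart.so.9.0", "libcudnn.so.7"]
print(loadable_libraries(CANDIDATES))
```

If only the CUDA 9.0 names come back, a wheel built against CUDA 8 would be expected to fail at import time, which matches the symptom described above.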

I’m looking for some of your specific build choices, and I assume you built this on the Jetson TX2 itself?

On tensorflow.org, the table at the very end of the build-from-source description page says that it has been tested with:

tensorflow_gpu-1.5.0, GPU, Python: 2.7, 3.3-3.6, GCC 4.8, Bazel 0.8.0, CuDNN:7, CUDA:9

I’m looking for the above information and also your ./configure choices.
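
To compare a local toolchain against the tested configuration quoted above, something like this sketch could collect the relevant version strings. It assumes `gcc`, `bazel` and `nvcc` are on the `PATH` under those names; `tool_output` is a hypothetical helper, and any tool that is missing is simply reported as `None`.

```python
import subprocess


def tool_output(cmd):
    """Run cmd and return everything it prints, or None if it cannot be run."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Tool not installed, not executable, or it hung.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # Commands assumed to print version information for each tool.
    for cmd in (["gcc", "--version"], ["bazel", "version"], ["nvcc", "--version"]):
        print(" ".join(cmd))
        print(tool_output(cmd))
        print()
```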

Below you can see my configuration choices.

My problem is that when running either InceptionV3 or ssd_mobilenet, the launch time can be minutes and inference performance resembles inference on an i7 CPU.

Best regards,
Kalevi

– ./configure –

Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]:
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:

Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: CUDA GPU Compute Capability | NVIDIA Developer.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable “TF_MKL_ROOT” every time before build.

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

kallud32qg · February 6, 2018, 2:48pm

Hi,

Here is some data on what I describe as slow to launch. This was built with CUDA 9, GCC 5, TF 1.5 and Bazel 0.9.1, and run on a freshly booted TX2.

First run 1m30s
Second run 7s
Third run 3s

Best regards,
Kalevi

nvidia@tegra-ubuntu:~$ cat hellotf.py
import tensorflow as tf

hello = tf.constant("helloe world")
sess = tf.Session()
print (sess.run(hello))

nvidia@tegra-ubuntu:~$ time python hellotf.py
2018-02-06 14:37:58.414481: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:881] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2018-02-06 14:37:58.414604: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB freeMemory: 6.12GiB
2018-02-06 14:37:58.414676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) → (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-02-06 14:39:28.770466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:859] Could not identify NUMA node of /job:localhost/replica:0/task:0/device:GPU:0, defaulting to 0. Your kernel may not have been built with NUMA support.
helloe world

real 1m34.633s
user 1m30.380s
sys 0m2.104s

nvidia@tegra-ubuntu:~$ time python hellotf.py
2018-02-06 14:39:45.115259: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:881] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2018-02-06 14:39:45.115373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB freeMemory: 4.90GiB
2018-02-06 14:39:45.115427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) → (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-02-06 14:39:51.184749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:859] Could not identify NUMA node of /job:localhost/replica:0/task:0/device:GPU:0, defaulting to 0. Your kernel may not have been built with NUMA support.
helloe world

real 0m8.195s
user 0m7.452s
sys 0m0.612s
nvidia@tegra-ubuntu:~$ time python hellotf.py
2018-02-06 14:39:59.755967: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:881] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2018-02-06 14:39:59.756095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB freeMemory: 4.90GiB
2018-02-06 14:39:59.756155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) → (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-02-06 14:40:01.071648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:859] Could not identify NUMA node of /job:localhost/replica:0/task:0/device:GPU:0, defaulting to 0. Your kernel may not have been built with NUMA support.
helloe world

real 0m3.425s
user 0m2.756s
sys 0m0.532s
nvidia@tegra-ubuntu:~$
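
The three manual `time` invocations above could also be scripted. The following is a minimal sketch that assumes Python 3 is available for the timing script itself (the thread uses `python`, i.e. Python 2.7, for `hellotf.py`); `time_runs` is a hypothetical helper name.

```python
import subprocess
import sys
import time


def time_runs(cmd, repeats=3):
    """Run cmd repeatedly and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=True
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 time_runs.py python hellotf.py
    for index, seconds in enumerate(time_runs(sys.argv[1:]), start=1):
        print("run %d: %.3f s" % (index, seconds))
```

Running it right after a reboot and again later makes it easier to separate one-off startup costs from steady-state launch time.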

AastaLLL · February 9, 2018, 7:33am

Hi, kallud32qg

Thanks for your feedback.

Could you also run the experiment from post #9 on a desktop GPU?
This will show whether the instability comes from TensorFlow or from the TX2.

Thanks.

kallud32qg · February 9, 2018, 9:41am

Hi AastaLLL,

If I were to run a test on a desktop GPU similar to the one in post #9, I would use a pre-built Python wheel, e.g. tensorflow_gpu-1.4.0rc0-cp27-none-linux_x86_64.whl (please note the x86_64 in the naming convention).

The reason for building a Python wheel myself is that there are no pre-built wheels with GPU support for, e.g., ARMv7 application processors. This is the reason Jesper released his wheel.

As mentioned earlier, I build TensorFlow on the board itself. I don’t cross-compile on a PC, so I would have to cross-compile on the TX2 for x86 to run the test.

I’m more than happy to provide detailed build steps if someone wants to replicate a build.

Best regards,
Kalevi

AastaLLL · February 12, 2018, 7:43am

Hi,

There is an official release for TensorFlow + GPU + x86 Linux machine.
You can install the TensorFlow package via apt-get directly.

We want to check whether the instability comes from the TF implementation or is specific to Jetson.
Please help by running the experiment in a desktop environment to narrow down the root cause.

Thanks.

kallud32qg · February 12, 2018, 11:50am

Hi AastaLLL

Thanks for your link. I’m aware of it. I assume x86 + GPU gets used a lot and that it works.

I don’t have a desktop PC with an Nvidia GPU to try this on. However, as you can see from my earlier post, feel free to save the following into a file and run it:

import tensorflow as tf

hello = tf.constant("helloe world")
sess = tf.Session()
print (sess.run(hello))

Here is a heavier example that is very slow to start: (the inception model which I refer to)

git clone https://github.com/tensorflow/models

cd models/tutorials/image/imagenet
python classify_image.py

(note that on first run it downloads the model in to /tmp)

On my board this fails most of the time. The hello-world example above is relevant only in that there seems to be a memory “leak” and some strange behavior where consecutive runs improve the startup time.

Best regards,
Kalevi

AastaLLL · February 13, 2018, 7:46am

Hi,

TensorFlow will generate some CUDA PTX code at the beginning.
So it may take a long time when first launching.

Thanks.
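
One possible reading, which the thread does not confirm: the `./configure` transcript earlier shows the compute-capability prompt with the default `3.5,5.2`, while the TX2 log reports compute capability 6.2. If the default was accepted, the binary would carry no native GPU code for the device and the driver would have to compile from PTX on first use. The sketch below checks for that mismatch using a simplified rule (native code is usable when the major version matches and the built minor version is not higher than the device's). The helper names are hypothetical.

```python
import re


def device_capability(log_text):
    """Extract 'major.minor' from a TensorFlow 'compute capability: X.Y' log line."""
    match = re.search(r"compute capability:\s*(\d+\.\d+)", log_text)
    return match.group(1) if match else None


def has_native_code(device_cc, built_ccs):
    """Simplified check: same major version and built minor <= device minor."""
    dev_major, dev_minor = (int(part) for part in device_cc.split("."))
    for cc in built_ccs.split(","):
        major, minor = (int(part) for part in cc.strip().split("."))
        if major == dev_major and minor <= dev_minor:
            return True
    return False


# Log line taken from the TX2 output earlier in the thread.
log_line = (
    "Creating TensorFlow device (/device:GPU:0) -> (device: 0, "
    "name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)"
)
cc = device_capability(log_line)
print(cc, has_native_code(cc, "3.5,5.2"))  # prints: 6.2 False
```

If this is what happened, entering `6.2` at the compute-capability prompt when rebuilding would be the obvious thing to try.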

kallud32qg · February 13, 2018, 9:52am

Hi Aasta,

Would you mind running the Python command below twice, so that I have an idea of what constitutes normal?
Thanks in advance,
Kalevi

–
git clone https://github.com/tensorflow/models
cd models/tutorials/image/imagenet

time python classify_image.py

CHuang1 · February 15, 2018, 6:40pm

Here are my two runs using a GTX 1050i board under x86 Ubuntu:

chuang@chijen-All-Series:~/ai/models/tutorials/image/imagenet$ time python classify_image.py

Downloading inception-2015-12-05.tgz 100.0%
Successfully downloaded inception-2015-12-05.tgz 88931400 bytes.
2018-02-15 10:11:25.927997: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-15 10:11:26.577493: W tensorflow/core/framework/op_def_util.cc:334] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)

real 0m23.560s
user 0m3.988s
sys 0m2.500s
chuang@chijen-All-Series:~/ai/models/tutorials/image/imagenet$ time python classify_image.py
2018-02-15 10:14:55.869540: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-15 10:14:56.273840: W tensorflow/core/framework/op_def_util.cc:334] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)

real 0m2.810s
user 0m3.332s
sys 0m1.324s
chuang@chijen-All-Series:~/ai/models/tutorials/image/imagenet$

This conforms to what AastaLLL described.
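
To put the first-run and later-run timings from the thread side by side, a small sketch like the one below converts the `real` lines to seconds and prints the ratio. The figures are copied from the logs above. Note that the two machines ran different scripts and that the desktop's first run also includes the model download, so the ratios are only a rough indication.

```python
import re


def real_seconds(time_output):
    """Convert a 'real XmY.YYYs' line printed by the shell's time keyword to seconds."""
    match = re.search(r"real\s+(\d+)m([\d.]+)s", time_output)
    if match is None:
        raise ValueError("no 'real' line found")
    return int(match.group(1)) * 60 + float(match.group(2))


# (first run, later run) pairs taken from the logs in this thread.
runs = {
    "TX2 hellotf.py": ("real 1m34.633s", "real 0m3.425s"),
    "desktop classify_image.py": ("real 0m23.560s", "real 0m2.810s"),
}
for name, (first, later) in runs.items():
    ratio = real_seconds(first) / real_seconds(later)
    print("%s: first run took %.1fx as long as the later run" % (name, ratio))
```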

kallud32qg · February 16, 2018, 2:36pm

Thanks Chijen!

Best regards,
Kalevi

sagnikdps · May 19, 2018, 12:13am

Hi,

This might be a noobish question, but will the wheel file work for a Jetson TX1 board as well?

Thanks

AastaLLL · May 21, 2018, 8:27am

Hi,

You can try the wheel included here:
https://github.com/peterlee0127/tensorflow-nvJetson

Thanks.
