Keras on Jetson TK1

AastaLLL · June 21, 2017, 11:55am

Hi,

I tried both TensorFlow and Keras today, and both work.
That said, it’s hard for us to verify that every third-party library runs correctly on our platform.
We recommend asking the library developers for details, since they are more familiar with their source.

TensorFlow: topic_1011135_tensorflow.zip (same as #11)

nvidia@tegra-ubuntu:~$ python topic_1011135_tensorflow.py 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Download Done!
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:874] ARM has no NUMA node, hardcoding to return zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 2.92GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GP10B, pci bus id: 0000:00:00.0)
WARNING:tensorflow:From topic_1011135_tensorflow.py:67: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): GP10B, Compute Capability 6.2
step 0, training accuracy 0.02
step 100, training accuracy 0.82
step 200, training accuracy 0.98
step 300, training accuracy 0.84
step 400, training accuracy 0.98
step 500, training accuracy 0.9
step 600, training accuracy 0.98
step 700, training accuracy 0.92
step 800, training accuracy 0.88
step 900, training accuracy 0.98
step 1000, training accuracy 0.98
step 1100, training accuracy 1
step 1200, training accuracy 0.94
step 1300, training accuracy 0.98
step 1400, training accuracy 0.98
step 1500, training accuracy 0.92
step 1600, training accuracy 0.98
step 1700, training accuracy 0.94
step 1800, training accuracy 1
step 1900, training accuracy 0.96
step 2000, training accuracy 0.98
step 2100, training accuracy 0.98
step 2200, training accuracy 0.94
step 2300, training accuracy 1
step 2400, training accuracy 0.96
step 2500, training accuracy 0.94
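
From step 200 onward the logged training accuracy ranges from 0.84 to 1. That kind of fluctuation would be expected if each value is computed on a single mini-batch, which is an assumption here since the script itself is only in the attachment. A minimal sketch for averaging the last few logged values follows; `mean_recent_accuracy` is a hypothetical helper name, not part of the attached script.

```python
import re

# Matches lines such as "step 300, training accuracy 0.84".
STEP_RE = re.compile(r"^step (\d+), training accuracy ([0-9.]+)$")


def mean_recent_accuracy(log_text, last_n=10):
    """Average the last `last_n` logged training accuracies (hypothetical helper)."""
    values = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            values.append(float(match.group(2)))
    if not values:
        raise ValueError("no 'step N, training accuracy X' lines found")
    recent = values[-last_n:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # Last three lines of the log above, used as sample input.
    sample = "\n".join([
        "step 2300, training accuracy 1",
        "step 2400, training accuracy 0.96",
        "step 2500, training accuracy 0.94",
    ])
    print(mean_recent_accuracy(sample, last_n=3))
```

Saving the console output to a file and passing its contents to the helper gives a smoother figure than any single logged step.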

Keras: topic_1011135_keras.zip

nvidia@tegra-ubuntu:~$ python topic_1011135_keras.py 
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:874] ARM has no NUMA node, hardcoding to return zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 4.21GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GP10B, pci bus id: 0000:00:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.92G (4210061312 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): GP10B, Compute Capability 6.2
60000/60000 [==============================] - 90s - loss: 0.3444 - acc: 0.8952 - val_loss: 0.0775 - val_acc: 0.9765
Epoch 2/12
60000/60000 [==============================] - 43s - loss: 0.1169 - acc: 0.9660 - val_loss: 0.0528 - val_acc: 0.9830
Epoch 3/12
60000/60000 [==============================] - 39s - loss: 0.0885 - acc: 0.9739 - val_loss: 0.0453 - val_acc: 0.9860
Epoch 4/12
60000/60000 [==============================] - 41s - loss: 0.0742 - acc: 0.9779 - val_loss: 0.0401 - val_acc: 0.9864
Epoch 5/12
60000/60000 [==============================] - 38s - loss: 0.0646 - acc: 0.9806 - val_loss: 0.0368 - val_acc: 0.9877
Epoch 6/12
60000/60000 [==============================] - 37s - loss: 0.0576 - acc: 0.9825 - val_loss: 0.0321 - val_acc: 0.9891
Epoch 7/12
60000/60000 [==============================] - 39s - loss: 0.0526 - acc: 0.9845 - val_loss: 0.0340 - val_acc: 0.9881
Epoch 8/12
60000/60000 [==============================] - 38s - loss: 0.0498 - acc: 0.9851 - val_loss: 0.0330 - val_acc: 0.9895
Epoch 9/12
60000/60000 [==============================] - 37s - loss: 0.0469 - acc: 0.9857 - val_loss: 0.0309 - val_acc: 0.9900
Epoch 10/12
60000/60000 [==============================] - 37s - loss: 0.0427 - acc: 0.9875 - val_loss: 0.0313 - val_acc: 0.9896
Epoch 11/12
60000/60000 [==============================] - 37s - loss: 0.0404 - acc: 0.9875 - val_loss: 0.0293 - val_acc: 0.9899
Epoch 12/12
60000/60000 [==============================] - 37s - loss: 0.0402 - acc: 0.9882 - val_loss: 0.0282 - val_acc: 0.9897
Test loss: 0.0281542236623
Test accuracy: 0.9897
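
Note that the `CUDA_ERROR_OUT_OF_MEMORY` line in the log above did not stop training: all 12 epochs completed. The failed request of 4210061312 bytes is about 3.92 GiB, roughly 93% of the 4.21 GiB reported as free, so TensorFlow appears to have tried to reserve most of the free memory in one allocation. The arithmetic as a small sketch (the helper names are illustrative and not part of either script):

```python
GIB = 1024 ** 3  # bytes per GiB


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / float(GIB)


def fraction_requested(requested_bytes, free_gib):
    """Share of the reported free memory covered by one allocation request."""
    return bytes_to_gib(requested_bytes) / free_gib


if __name__ == "__main__":
    requested = 4210061312  # from the cuda_driver.cc error line
    free_gib = 4.21         # from the "Free memory: 4.21GiB" line
    print("requested: %.2f GiB" % bytes_to_gib(requested))
    print("share of reported free memory: %.0f%%"
          % (100 * fraction_requested(requested, free_gib)))
```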

topic_1011135_keras.zip (1.06 KB)
topic_1011135_tensorflow.zip (1.14 KB)

CaitC · June 21, 2017, 2:43pm

Can you also include the wheel for TF and the exact install instructions you used for both TF and Keras? Thank you.

AastaLLL · June 22, 2017, 2:25am

Hi,

We followed this page to install TensorFlow.
For Keras, we used the same procedure as described in #8.

CaitC · June 26, 2017, 2:29pm

Could you please publish your TF wheel for public use on TK1?

AastaLLL · June 27, 2017, 4:43am

Hi,

It’s not appropriate for us to publish a third-party library.
Maybe you can post here:
https://github.com/tensorflow/tensorflow/issues/851

May I know your current status?
Thanks.

CaitC · June 28, 2017, 9:32pm

First, I had to use another x86 + NVIDIA system and reduce the number of training steps to make it work without swap file space. Now I am going to follow the instructions on the TX2 page you sent to see if I can build a working wheel.

CaitC · June 28, 2017, 10:17pm

I am unable to build bazel because there is not enough storage on the TK1.
ubuntu@tegra-ubuntu:~/mybazel$ ./compile.sh
INFO: You can skip this first step by providing a path to the bazel binary as second argument:
INFO: ./compile.sh compile /path/to/bazel

CaitC · June 28, 2017, 10:28pm

ubuntu@tegra-ubuntu:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 14318640 5501360 8066896 41% /

How much space is needed to build bazel?

AastaLLL · June 30, 2017, 9:20am

Hi,

I usually build TensorFlow on a 128 GB SD card.
I think 16 GB of external storage is the minimum, since you will also need to add some swap space when executing.
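
As a rough pre-flight check before starting a build, the free space on a mount point can be compared against a threshold. Below is a minimal sketch using the standard `os.statvfs` call (Unix only). The 16 GiB default follows the estimate above, treating "16G" as GiB, and the function names and the choice of `/` as the default path are assumptions for illustration. For reference, the 8066896 available 1K-blocks in the `df` output earlier in the thread correspond to about 7.7 GiB.

```python
import os

GIB = 1024 ** 3  # bytes per GiB


def kib_blocks_to_gib(blocks_1k):
    """Convert a `df` value given in 1K-blocks to GiB."""
    return blocks_1k * 1024 / float(GIB)


def free_gib(path="/"):
    """Space available to unprivileged users on the filesystem holding `path`."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize / float(GIB)


def has_enough_space(path="/", required_gib=16.0):
    """True if the filesystem holding `path` has at least `required_gib` free."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print("df reported %.2f GiB available" % kib_blocks_to_gib(8066896))
    print("free on /: %.2f GiB" % free_gib("/"))
    print("enough for a 16 GiB build area: %s" % has_enough_space("/"))
```

Pointing `path` at the mount point of the SD card or external drive checks that device rather than the root filesystem.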
