TensorFlow 1.6 not working with JetPack 3.2

I installed TensorFlow 1.6 with JetPack 3.2, as outlined here: https://gist.github.com/vellamike/7c26158c93e89ef155c1cc953bbba956

However, when I try to run the following trivial example:

import tensorflow as tf
# Creates a graph.
print('a')
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
print('b')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
print('c')
c = tf.matmul(a, b)
print('d')

# Creates a session with log_device_placement set to True.
print('e')
sess = tf.Session()  # config=tf.ConfigProto(log_device_placement=True)
# Run the op.
print('f')
print(sess.run(c))

I get the following error over and over again:

2016-05-06 05:43:48.865368: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 1048576 bytes on host: CUDA_ERROR_UNKNOWN
2016-05-06 05:43:48.865480: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 1048576
2016-05-06 05:43:48.865508: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 943872 bytes on host: CUDA_ERROR_UNKNOWN
2016-05-06 05:43:48.865532: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 943872
2016-05-06 05:43:48.865568: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 849664 bytes on host: CUDA_ERROR_UNKNOWN
2016-05-06 05:43:48.865619: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 849664
2016-05-06 05:43:48.865644: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 764928 bytes on host: CUDA_ERROR_UNKNOWN
2016-05-06 05:43:48.865667: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 764928

Hi,

Thanks for your feedback.
We are checking this issue internally and will update you later.

Thank you @AastaLLL. It will be interesting to see whether you are able to reproduce the problem internally.

If you are able to get TF 1.6 to run with JetPack 3.2, can you provide detailed instructions on how you did it? Also, if possible, could you provide a Python wheel? I am using Python 3.

Hi,

We can reproduce this error internally.
We suspect a permissions issue, since we can launch a TF session successfully before rebooting.

Our script and pip wheel can be found here (Python 2 only):

We will update you later.

Thanks.

@AastaLLL - thank you for your support. Please let me know when you have found a resolution to this issue.

Mike

Hi,

We have tested TF-1.6rc1 on JetPack 3.1 and it works correctly. (The previous build was TF-1.6rc0.)

Could you check TF-1.6rc1 on JetPack 3.2 DP?
You can build it with this script:

Thanks.

Hello all

I built the 1.6rc0 release with JetPack 3.2 yesterday and ran into the same problems as above when running a network.

Thanks for providing the scripts updated for the latest JetPack version!

I just rebuilt everything using the latest version on the 1.6rc1 branch and it works perfectly with JetPack 3.2 DP.

It seems I was a bit too quick to celebrate. Sometimes the same error presents itself. It’s intermittent and very difficult to pinpoint.

At first I got it after a reboot. Later after attempting to free up memory to fit a larger model.

I could run my tests, but the stability is abysmal.

Dear @AastaLLL, I am afraid I agree with @Hallon: the script (tf1.6_install_wheel.sh) does not work correctly. It worked when I first ran it. However, after a reboot, when running even a very simple operation such as the following:

>>> import tensorflow as tf
>>> a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
>>> b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
>>> c = tf.matmul(a,b)
>>> sess = tf.Session()
2016-05-06 05:45:36.431471: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] ARM64 does not support NUMA - returning NUMA node zero
2016-05-06 05:45:36.431593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1208] Found device 0 with properties: 
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 5.56GiB
2016-05-06 05:45:36.431662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1308] Adding visible gpu devices: 0
2016-05-06 05:45:37.126531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:989] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5208 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
>>> sess.run(c)

I get errors which look like this:

2016-05-06 05:46:07.460227: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 2304
2016-05-06 05:46:07.460244: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 2304 bytes on host: CUDA_ERROR_UNKNOWN
2016-05-06 05:46:07.460261: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 2304
2016-05-06 05:46:07.460279: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 2304 bytes on host: CUDA_ERROR_UNKNOWN

(the same pair of lines repeats many more times)

Can you please reboot the Jetson where you ran your install script and try my code, to verify that you observe the same issue as I do?

Hi,

We have checked this on both JetPack 3.1 and JetPack 3.2 DP.
This error only occurs with CUDA 9.0.

We are not sure whether there is an issue with memory allocation in CUDA 9.0.
We are checking with our internal CUDA team and will update you later.

Thanks.

@AastaLLL thank you for confirming that you are able to reproduce the error. It’s interesting that it’s CUDA 9.0 related. I look forward to the resolution on this.

Mike

@AastaLLL

Thank you for the feedback!

I also look forward to hearing about a resolution to the problem!

Any updates on the issue ? I just encountered this today.

Hi,

Could you check whether memory allocation works after enabling the gpu_options.allow_growth option?

config = tf.ConfigProto()
config.gpu_options.allow_growth = True

session = tf.Session(config=config, ...)

Please let us know the result.
Thanks.

@AastaLLL this appears to fix the problem! My understanding is that the allow_growth option grows the GPU memory allocation dynamically as needed, rather than mapping all of the available GPU memory to the TensorFlow process up front. What was the original cause of the problem, and why does this option solve it?

Hi,

Per the TensorFlow documentation:
------
by default it will try to allocate all the available GPU memory.
------

On a fresh boot the available memory is very high (6.2 GB).
In an iGPU environment such a large allocation will generally fail, because the host and the GPU share the same physical memory.
The workaround restricts how much memory is allocated, so the allocation succeeds.
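As a back-of-the-envelope illustration (using the figures from the session log earlier in this thread, not a new measurement), here is roughly what is left for pinned host buffers after TensorFlow's default pre-allocation on the shared-memory TX2:

```python
# Rough arithmetic for the shared-memory TX2, using figures from the log above.
# This illustrates why pinned host allocations fail; it is not a measurement.
free_gib = 5.56       # freeMemory reported by TF at session creation
tf_pool_mib = 5208    # memory TF reserved for /device:GPU:0 by default

# Host and GPU draw from the same physical pool, so whatever TF reserves
# for the GPU is no longer available for pinned host buffers.
remaining_mib = free_gib * 1024 - tf_pool_mib
print(round(remaining_mib))  # -> 485 (MiB left for everything else)
```

With allow_growth enabled, TF skips the big up-front reservation, so the pool never gets squeezed this hard.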

Thanks.

Hello;

Which file do I edit to enable gpu_options.allow_growth?

@RichieA You make this change in your own TensorFlow script, not in any particular file:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

Ah thank you! This fixed my issue too!

Any idea how to apply config.gpu_options.allow_growth = True with the TensorFlow C API? I have the same problem trying to run inference from C.
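One possible approach, sketched as an assumption rather than a tested recipe: the C API's TF_SetConfig takes a serialized ConfigProto as raw bytes. You can produce those bytes once in Python and embed them in your C code. Below is a minimal pure-Python sketch that hand-encodes ConfigProto{ gpu_options { allow_growth: true } }; the field numbers (gpu_options = 6 in ConfigProto, allow_growth = 4 in GPUOptions) are taken from TensorFlow's config.proto, so verify them against the proto shipped with your TF version.

```python
# Hand-encode ConfigProto{ gpu_options { allow_growth: true } } as protobuf
# bytes, suitable for the C API's TF_SetConfig(opts, buf, buf_len, status).
# Assumed field numbers (check against your TF version's config.proto):
#   ConfigProto.gpu_options = field 6 (embedded message, wire type 2)
#   GPUOptions.allow_growth = field 4 (bool, wire type 0)

def varint(n):
    # Protobuf base-128 varint encoding: 7 bits per byte, MSB = continuation.
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

def field(number, wire_type, payload):
    # One protobuf field: tag varint (number << 3 | wire_type) then payload.
    return varint((number << 3) | wire_type) + payload

gpu_options = field(4, 0, varint(1))                           # allow_growth: true
config = field(6, 2, varint(len(gpu_options)) + gpu_options)   # gpu_options { ... }

print(config.hex())  # -> 32022001
```

With TF installed, `tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True)).SerializeToString()` should produce the same bytes. On the C side you would then do something like `TF_SetConfig(opts, "\x32\x02\x20\x01", 4, status);` on your TF_SessionOptions before creating the session.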