TensorFlow 1.11.0 (via whl) - JetPack 3.3

Hello,
I recently trained a ResNet-101 with TensorFlow 1.11.0 on a Tesla K80 GPU, then installed TF 1.11.0 on a TX2 from the whl file (tensorflow-1.11.0-cp35-cp35m-linux_aarch64.whl) supplied by Nvidia. However, whenever I try to run inference with this network, the process is killed with this error: failed to create cublas handle…

Please find below the log output from session initialization (loading the frozen model, then inference):

-- Load model --
2019-04-08 14:19:28.388289: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:931] ARM64 does not support NUMA - returning NUMA node zero
2019-04-08 14:19:28.388454: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties: 
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 3.02GiB
2019-04-08 14:19:28.388508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2019-04-08 14:19:29.699067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-08 14:19:29.699176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0 
2019-04-08 14:19:29.699204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N 
2019-04-08 14:19:29.699401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3141 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)

-- Inference --
2019-04-08 14:19:48.314675: W tensorflow/core/framework/allocator.cc:113] Allocation of 18874368 exceeds 10% of system memory.
2019-04-08 14:19:48.332104: W tensorflow/core/framework/allocator.cc:113] Allocation of 8388608 exceeds 10% of system memory.
2019-04-08 14:19:48.335623: W tensorflow/core/framework/allocator.cc:113] Allocation of 9437184 exceeds 10% of system memory.
2019-04-08 14:19:48.338348: W tensorflow/core/framework/allocator.cc:113] Allocation of 4194304 exceeds 10% of system memory.
2019-04-08 14:19:48.340557: W tensorflow/core/framework/allocator.cc:113] Allocation of 4194304 exceeds 10% of system memory.
2019-04-08 14:19:58.966366: E tensorflow/stream_executor/cuda/cuda_blas.cc:464] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2019-04-08 14:19:59.080731: E tensorflow/stream_executor/cuda/cuda_blas.cc:464] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2019-04-08 14:19:59.516636: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I installed the whl file as follows:

sudo -H pip3 install tensorflow-1.11.0-cp35-cp35m-linux_aarch64.whl

I checked the installed version to be sure:

nvidia@tegra-ubuntu:~$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
1.11.0

In Python I set the GPU memory fraction to 0.4 (I also tried 0.8, with the same behavior):

self.config = tf.ConfigProto()
self.config.allow_soft_placement = True
self.config.gpu_options.per_process_gpu_memory_fraction = 0.4
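For what it's worth, the 3141 MB in the "Created TensorFlow device" log line is consistent with 0.4 x totalMemory (7.67 GiB). A quick check of that arithmetic, assuming the fraction is applied to total memory rather than to free memory (the variable names are mine, the numbers come from the log above):

```python
# Values copied from the log output above.
total_gib = 7.67   # totalMemory reported for the Tegra X2
free_gib = 3.02    # freeMemory reported at startup
fraction = 0.4     # per_process_gpu_memory_fraction

# Assumption: the fraction is taken of total memory, not free memory.
requested_mib = fraction * total_gib * 1024
free_mib = free_gib * 1024

print("requested: %d MiB" % requested_mib)
print("free:      %d MiB" % free_mib)
print("request exceeds free memory:", requested_mib > free_mib)
```

If that assumption holds, the amount requested is slightly above the 3.02 GiB reported as free at startup, and a fraction of 0.8 would be far above it.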

And I initialized the session as follows:

with tf.device("/gpu:0"):
    self.detection_graph = tf.Graph()
    with tf.Session(config=self.config) as sess:

Please note that I have also tried the other method, setting

allow_growth=True

together with a swap file, but this does not work either. I can confirm that both memory usage and the swap file grow.

I read somewhere that TF 1.11.0 expects cuDNN 7.2; however, the cuDNN version on my system is 7.1.5:

nvidia@tegra-ubuntu:~$ dpkg   -l | grep cuda
ii  cuda-command-line-tools-9-0                9.0.252-1                                     arm64        CUDA command-line tools
ii  cuda-core-9-0                              9.0.252-1                                     arm64        CUDA core tools
ii  cuda-cublas-9-0                            9.0.252-1                                     arm64        CUBLAS native runtime libraries
ii  cuda-cublas-dev-9-0                        9.0.252-1                                     arm64        CUBLAS native dev links, headers
ii  cuda-cudart-9-0                            9.0.252-1                                     arm64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-9-0                        9.0.252-1                                     arm64        CUDA Runtime native dev links, headers
ii  cuda-cufft-9-0                             9.0.252-1                                     arm64        CUFFT native runtime libraries
ii  cuda-cufft-dev-9-0                         9.0.252-1                                     arm64        CUFFT native dev links, headers
ii  cuda-curand-9-0                            9.0.252-1                                     arm64        CURAND native runtime libraries
ii  cuda-curand-dev-9-0                        9.0.252-1                                     arm64        CURAND native dev links, headers
ii  cuda-cusolver-9-0                          9.0.252-1                                     arm64        CUDA solver native runtime libraries
ii  cuda-cusolver-dev-9-0                      9.0.252-1                                     arm64        CUDA solver native dev links, headers
ii  cuda-cusparse-9-0                          9.0.252-1                                     arm64        CUSPARSE native runtime libraries
ii  cuda-cusparse-dev-9-0                      9.0.252-1                                     arm64        CUSPARSE native dev links, headers
ii  cuda-documentation-9-0                     9.0.252-1                                     arm64        CUDA documentation
ii  cuda-driver-dev-9-0                        9.0.252-1                                     arm64        CUDA Driver native dev stub library
ii  cuda-libraries-dev-9-0                     9.0.252-1                                     arm64        CUDA Libraries 9.0 development meta-package
ii  cuda-license-9-0                           9.0.252-1                                     arm64        CUDA licenses
ii  cuda-misc-headers-9-0                      9.0.252-1                                     arm64        CUDA miscellaneous headers
ii  cuda-npp-9-0                               9.0.252-1                                     arm64        NPP native runtime libraries
ii  cuda-npp-dev-9-0                           9.0.252-1                                     arm64        NPP native dev links, headers
ii  cuda-nvgraph-9-0                           9.0.252-1                                     arm64        NVGRAPH native runtime libraries
ii  cuda-nvgraph-dev-9-0                       9.0.252-1                                     arm64        NVGRAPH native dev links, headers
ii  cuda-nvml-dev-9-0                          9.0.252-1                                     arm64        NVML native dev links, headers
ii  cuda-nvrtc-9-0                             9.0.252-1                                     arm64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-9-0                         9.0.252-1                                     arm64        NVRTC native dev links, headers
ii  cuda-repo-l4t-9-0-local                    9.0.252-1                                     arm64        cuda repository configuration files
ii  cuda-samples-9-0                           9.0.252-1                                     arm64        CUDA example applications
ii  cuda-toolkit-9-0                           9.0.252-1                                     arm64        CUDA Toolkit 9.0 meta-package
ii  libcudnn7                                  7.1.5.14-1+cuda9.0                            arm64        cuDNN runtime libraries
ii  libcudnn7-dev                              7.1.5.14-1+cuda9.0                            arm64        cuDNN development libraries and headers
ii  libcudnn7-doc                              7.1.5.14-1+cuda9.0                            arm64        cuDNN documents and samples
ii  libgie-dev                                 4.1.3-1+cuda9.0                               arm64        Transitional package
ii  libnvinfer-dev                             4.1.3-1+cuda9.0                               arm64        TensorRT development libraries and headers
ii  libnvinfer-samples                         4.1.3-1+cuda9.0                               arm64        TensorRT samples and documentation
ii  libnvinfer4                                4.1.3-1+cuda9.0                               arm64        TensorRT runtime libraries
ii  tensorrt                                   4.0.2.0-1+cuda9.0                             arm64        Meta package of TensorRT
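To read the cuDNN version out of this listing programmatically, something like the sketch below could be used. The helper name cudnn_version is hypothetical, the sample lines are copied from the output above, and the 7.2 threshold is only the requirement I read about, not something I have confirmed.

```python
def cudnn_version(dpkg_lines, package="libcudnn7"):
    """Return the version of `package` as a tuple of ints, or None if absent."""
    for line in dpkg_lines:
        fields = line.split()
        # Installed packages look like: ii  <name>  <version>  <arch>  <description>
        if len(fields) >= 3 and fields[0] == "ii" and fields[1] == package:
            upstream = fields[2].split("-")[0]   # "7.1.5.14-1+cuda9.0" -> "7.1.5.14"
            return tuple(int(part) for part in upstream.split("."))
    return None


sample = [
    "ii  cuda-cublas-9-0   9.0.252-1            arm64  CUBLAS native runtime libraries",
    "ii  libcudnn7         7.1.5.14-1+cuda9.0   arm64  cuDNN runtime libraries",
    "ii  libcudnn7-dev     7.1.5.14-1+cuda9.0   arm64  cuDNN development libraries and headers",
]

version = cudnn_version(sample)
print("cuDNN runtime version:", version)
# Assumed requirement (unconfirmed): cuDNN 7.2 or newer.
print("older than 7.2:", version is not None and version[:2] < (7, 2))
```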

Could someone from Nvidia help with this?

Found it…

~/.nv

was owned by root.

As a precaution, I removed the folder with

rm -r ~/.nv

and this fixed the issue.
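For anyone hitting the same error, a small check like the one below can tell whether ~/.nv belongs to the current user before deleting anything. This is only a sketch: the function name owned_by_current_user is mine, and it assumes a Linux system where os.getuid() is available.

```python
import os
from pathlib import Path


def owned_by_current_user(path):
    """Return True/False for an existing path, or None if the path is missing."""
    p = Path(path).expanduser()
    if not p.exists():
        return None
    # Compare the owner uid of the path with the uid of the running process.
    return p.stat().st_uid == os.getuid()


result = owned_by_current_user("~/.nv")
if result is None:
    print("~/.nv does not exist")
elif result:
    print("~/.nv is owned by the current user")
else:
    print("~/.nv is owned by another user (possibly root)")
```

In my case this would have reported another owner, most likely because something was run with sudo earlier.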