I am running into stability issues using TensorFlow 1.7 with JetPack 3.2 installed.
After a fresh boot I can create a virtualenv with all my dependencies and install TensorFlow 1.7 from the wheel, and everything works fine. The problems start after a reboot: when I reactivate the virtualenv and run the program again, session creation fails with an unknown CUDA error. Here's an example:
(DeepSpeaker) nvidia@tegra-ubuntu:~/MultimodalID/DeepSpeaker$ python DeepSpeaker.py
/home/nvidia/.virtualenvs/DeepSpeaker/lib/python3.5/site-packages/pydub/utils.py:165: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
2018-05-22 00:26:15.575820: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-05-22 00:26:15.575941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 5.75GiB
2018-05-22 00:26:15.575990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-05-22 00:26:17.443806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-05-22 00:26:17.443881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-05-22 00:26:17.443907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-05-22 00:26:17.444078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5178 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-05-22 00:26:17.881770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-05-22 00:26:17.881873: E tensorflow/core/common_runtime/direct_session.cc:167] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Traceback (most recent call last):
File "DeepSpeaker.py", line 105, in <module>
startsec=0, endsec=12, num_clips=9)
File "/home/nvidia/MultimodalID/DeepSpeaker/experiments/inference_pipe.py", line 48, in run_twophase_inference
sess2 = tf.Session(graph=g2)
File "/home/nvidia/.virtualenvs/DeepSpeaker/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1509, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/home/nvidia/.virtualenvs/DeepSpeaker/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 638, in __init__
self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/home/nvidia/.virtualenvs/DeepSpeaker/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
The only fix I have found is to run pip uninstall tensorflow and then reinstall the wheel, after which the program runs normally.
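To help separate the session failure from the rest of DeepSpeaker.py, a minimal check along these lines could be run in the same virtualenv right after a reboot. This is only a sketch: the names check_session and main are made up for illustration, it assumes the TensorFlow 1.x tf.Session API shown in the traceback, and it catches a generic Exception rather than relying on the exact error class.

```python
def check_session(make_session):
    """Try to create and close a session; return (ok, detail)."""
    try:
        sess = make_session()
    except Exception as exc:
        # In the log above this is an InternalError ("Failed to create session.")
        return False, "{}: {}".format(type(exc).__name__, exc)
    sess.close()
    return True, "session created"


def main():
    # Imported here so the helper above can be used without TensorFlow installed.
    import tensorflow as tf  # TF 1.x API assumed

    # Same pattern as the traceback: a session bound to an explicit graph.
    ok, detail = check_session(lambda: tf.Session(graph=tf.Graph()))
    print(tf.__version__, detail)
    return 0 if ok else 1
```

Calling main() once before and once after a reboot, with no other code involved, would show whether the failure depends on the application or only on TensorFlow and the CUDA runtime.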