Attempt 1.
Step 1.
Downloading the dataset.
wget https://github.com/javathunderman/retinopathy-dataset/archive/master.zip
Downloading the frozen graph:
wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip
unzip inception_dec_2015.zip
Archive: inception_dec_2015.zip
inflating: imagenet_comp_graph_label_strings.txt
inflating: LICENSE
inflating: tensorflow_inception_graph.pb
using environment of DeepStream 5
c4e41ec4dce6 nvcr.io/nvidia/deepstream-l4t:5.0-dp-20.04-samples
docker start c4e41ec4dce6
docker exec -it c4e41ec4dce6 bash
:/opt/nvidia/deepstream/deepstream-5.0#
:/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb --output tensorflow_inception_graph.onnx
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/__init__.py", line 14, in <module>
from . import verbose_logging as logging
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/verbose_logging.py", line 14, in <module>
import tensorflow as tf
File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
further it got into cusolver issue that belong to read only system thus can not be adjusted as in the example above with symlink.
Trying different container from NGX ML-tensorflow
/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb --output tensorflow_inception_graph.onnx
2020-09-03 10:48:07.690127: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.147892: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-03 10:48:15.153429: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.153618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.153712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.224170: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.305383: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.417929: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.554397: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.631347: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.632692: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.633105: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633556: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.669545: W tensorflow/core/platform/profile_utils/cpu_utils.cc:106] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2020-09-03 10:48:15.670181: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4f27350 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.670261: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-03 10:48:15.835514: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.836804: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5093da0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.836950: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Xavier, Compute Capability 7.2
2020-09-03 10:48:15.863081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.863471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.863658: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.864142: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.864439: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.864609: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.864705: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.864817: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.864893: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.865351: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865973: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.866410: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:22.266070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 10:48:22.266233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-09-03 10:48:22.266291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-09-03 10:48:22.266850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.267934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.268273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2525 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2020-09-03 10:48:23.903300: W tensorflow/core/framework/op_def_util.cc:371] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 171, in <module>
main()
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 125, in main
graph_def, inputs, outputs = tf_loader.from_graphdef(args.graphdef, args.inputs, args.outputs)
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 150, in from_graphdef
frozen_graph = freeze_session(sess, input_names=input_names, output_names=output_names)
File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 113, in freeze_session
output_node_names = [i.split(':')[:-1][0] for i in output_names]
TypeError: 'NoneType' object is not iterable
Probably I should try with tensorflow 1.4?
Obviously first attemppt to us the instruction to install specific version 1.4.0 of tensoflow fails using
pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 tensorflow==1.4.0+nv20.08
Looking in indexes: https://pypi.org/simple, https://developer.download.nvidia.com/compute/redist/jp/v44
ERROR: Could not find a version that satisfies the requirement tensorflow==1.4.0+nv20.08 (from versions: 1.15.2+nv20.4, 1.15.2+nv20.6, 1.15.3+nv20.7, 1.15.3+nv20.8, 2.1.0+nv20.4, 2.2.0+nv20.6, 2.2.0+nv20.7, 2.2.0+nv20.8)
ERROR: No matching distribution found for tensorflow==1.4.0+nv20.08
Since it doesn’t appear possible to get the version 1.4.0, I have to use
$ sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 ‘tensorflow<2’