Given an .engine file and an .h5 file, how can they be incorporated into DeepStream?

yes, you need to get them outside of the model itself

@mchi
Would it be easier to incorporate the following solution into DeepStream, given that the sources are provided as is in the documentation below? Would you be able to help with such an implementation?

I think it’s feasible. I think the steps should be:

  1. convert the pytorch model to onnx
  2. configure the gie config with the onnx, and implement the detection post processing
  3. inference with video or image as input for DeepStream
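
Step 2 above can be sketched roughly as follows. This is a minimal sketch, not a tested DeepStream configuration: it writes a Gst-nvinfer `[property]` group for a classifier loaded from an ONNX file. The file names (`model.onnx`, `labels.txt`), the 1/255 scale factor, and the 0.5 threshold are assumptions chosen for illustration.

```python
import configparser
import io


def build_nvinfer_config(onnx_path, labels_path, batch_size=1):
    """Return the text of a Gst-nvinfer config for a primary classifier."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep key case exactly as written
    cfg["property"] = {
        "gpu-id": "0",
        "onnx-file": onnx_path,
        "labelfile-path": labels_path,
        "batch-size": str(batch_size),
        "network-type": "1",   # 1 = classifier
        "process-mode": "1",   # 1 = primary, runs on the full frame
        "network-mode": "0",   # 0 = FP32
        "net-scale-factor": "0.0039215697906911373",  # assumed 1/255 scaling
        "classifier-threshold": "0.5",                # assumed threshold
        "gie-unique-id": "1",
    }
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical file names; replace with the real model and label files.
config_text = build_nvinfer_config("model.onnx", "labels.txt")
print(config_text)
```

The preprocessing values have to match whatever the model was trained with, so they should be checked against the training script before use.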

Attempt 1.
Step 1.
Downloading the dataset.

wget https://github.com/javathunderman/retinopathy-dataset/archive/master.zip

Downloading the frozen graph:

wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip
 unzip inception_dec_2015.zip 
Archive:  inception_dec_2015.zip
  inflating: imagenet_comp_graph_label_strings.txt  
  inflating: LICENSE                 
  inflating: tensorflow_inception_graph.pb  

Using the DeepStream 5 environment:

c4e41ec4dce6        nvcr.io/nvidia/deepstream-l4t:5.0-dp-20.04-samples 
docker start c4e41ec4dce6
docker exec -it c4e41ec4dce6 bash
:/opt/nvidia/deepstream/deepstream-5.0# 

:/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb  --output tensorflow_inception_graph.onnx
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/__init__.py", line 14, in <module>
    from . import verbose_logging as logging
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/verbose_logging.py", line 14, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

After that, it ran into a cusolver issue; the library belongs to a read-only part of the system, so it cannot be fixed with a symlink as in the example above.
Trying a different container from NGC, ML/TensorFlow:

/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb  --output tensorflow_inception_graph.onnx
2020-09-03 10:48:07.690127: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.147892: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-03 10:48:15.153429: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.153618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.153712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.224170: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.305383: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.417929: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.554397: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.631347: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.632692: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.633105: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633556: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.669545: W tensorflow/core/platform/profile_utils/cpu_utils.cc:106] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2020-09-03 10:48:15.670181: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4f27350 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.670261: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-03 10:48:15.835514: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.836804: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5093da0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.836950: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Xavier, Compute Capability 7.2
2020-09-03 10:48:15.863081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.863471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.863658: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.864142: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.864439: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.864609: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.864705: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.864817: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.864893: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.865351: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865973: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.866410: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:22.266070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 10:48:22.266233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-09-03 10:48:22.266291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-09-03 10:48:22.266850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.267934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.268273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2525 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2020-09-03 10:48:23.903300: W tensorflow/core/framework/op_def_util.cc:371] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 171, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 125, in main
    graph_def, inputs, outputs = tf_loader.from_graphdef(args.graphdef, args.inputs, args.outputs)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 150, in from_graphdef
    frozen_graph = freeze_session(sess, input_names=input_names, output_names=output_names)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 113, in freeze_session
    output_node_names = [i.split(':')[:-1][0] for i in output_names]
TypeError: 'NoneType' object is not iterable
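
The `TypeError` above is raised where `tf_loader.freeze_session` iterates over `output_names`, which is `None` here because the command passed no `--outputs`. A minimal sketch of assembling the same command with explicit tensor names follows. The names `input:0` and `output:0` are placeholders (assumptions), not the real tensor names of this graph, and must be replaced after inspecting the graph.

```python
import shlex


def tf2onnx_command(graph_pb, onnx_out, inputs, outputs):
    """Build the tf2onnx CLI call with explicit input/output tensor names."""
    if not inputs or not outputs:
        raise ValueError("a frozen graph needs explicit input and output tensor names")
    return [
        "python3", "-m", "tf2onnx.convert",
        "--input", graph_pb,
        "--inputs", ",".join(inputs),
        "--outputs", ",".join(outputs),
        "--output", onnx_out,
    ]


# Placeholder tensor names: inspect the graph to find the real ones.
cmd = tf2onnx_command(
    "tensorflow_inception_graph.pb",
    "tensorflow_inception_graph.onnx",
    inputs=["input:0"],
    outputs=["output:0"],
)
print(" ".join(shlex.quote(part) for part in cmd))
```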

Perhaps I should try TensorFlow 1.4?
The first attempt to install the specific version 1.4.0 of TensorFlow, following the instructions, fails:

 pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 tensorflow==1.4.0+nv20.08         
Looking in indexes: https://pypi.org/simple, https://developer.download.nvidia.com/compute/redist/jp/v44
ERROR: Could not find a version that satisfies the requirement tensorflow==1.4.0+nv20.08 (from versions: 1.15.2+nv20.4, 1.15.2+nv20.6, 1.15.3+nv20.7, 1.15.3+nv20.8, 2.1.0+nv20.4, 2.2.0+nv20.6, 2.2.0+nv20.7, 2.2.0+nv20.8)
ERROR: No matching distribution found for tensorflow==1.4.0+nv20.08

Since version 1.4.0 does not appear to be available, I have to use

$ sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'
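
To see which build a `tensorflow<2` constraint can select from the versions listed in the pip error message above, here is a small illustrative sketch. It is a simplified comparison written for this list only, not pip's actual resolver.

```python
def parse_version(tag):
    """Split '1.15.3+nv20.8' into comparable tuples: ((1, 15, 3), (20, 8))."""
    release, _, local = tag.partition("+")
    release_key = tuple(int(p) for p in release.split("."))
    local_key = tuple(int(p) for p in local.replace("nv", "").split(".") if p)
    return release_key, local_key


# Versions listed in the pip error message above.
available = [
    "1.15.2+nv20.4", "1.15.2+nv20.6", "1.15.3+nv20.7", "1.15.3+nv20.8",
    "2.1.0+nv20.4", "2.2.0+nv20.6", "2.2.0+nv20.7", "2.2.0+nv20.8",
]

# Keep only the 1.x builds, then take the newest of them.
tf1_builds = [v for v in available if parse_version(v)[0][0] < 2]
newest_tf1 = max(tf1_builds, key=parse_version)
print(newest_tf1)
```

Among the listed builds, the newest 1.x one is 1.15.3+nv20.8; no 1.4.x build is offered on that index.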

Hi @Andrey1984,
This is a new topic that has nothing to do with the original issue; please file a new topic.
And if the original questions you asked have been addressed, could you mark it closed?

BTW, for the ONNX conversion, the pb alone is not enough.

The original concerns haven't been addressed;
there is still a need to import the model, or at least get it executed with TensorRT within DeepStream.
However, it seems that the subtask of implementing the Intel scenario, if it works, will be applicable to the original issue.
@mchi, will you be able to assist with implementing the Intel scenario in the separate topic here?
Which steps or components need to be added for the ONNX conversion to go through?
New thread: Implementing DeepStream/ TRT integration by Intels scenario

what concern?

The concern is to get a model, similar to the one by Intel, integrated into DeepStream using TRT.
Since Intel provides open source and a full definition of the steps, it seems to make sense to try getting the integration done with it, to see if it works, given that ample sources are provided for the Intel scenario, including the dataset images.

You can refer to Detecting Diabetic Retinopathy Using Deep Learning on Intel®... to train the model with the dataset, or just use the model it provides.
Providing a model is outside the scope of DeepStream support.

Once the model is trained or provided from the mentioned article,
will the integration of the model into DeepStream be within DeepStream support or not?

Depends on what the issue is.
For example, if a user customizes a model that needs custom post-processing, the user should implement that post-processing themselves, since DeepStream only provides the inference.

In the given scenario it is an image classification model
that just predicts, with some probability, whether the image shows the disease or not.
That doesn't imply post-processing, does it?
Moreover, it seems that the model is produced by applying the following algorithm to the dataset:

python retrain.py \
  --bottleneck_dir=bottlenecks \
  --how_many_training_steps=300 \
  --model_dir=inception \
  --output_graph=retrained_graph.pb \
  --output_labels=retrained_labels.txt \
  --image_dir=<>

the code above seems agnostic to post processing

If so, I think it should be fine. So, no concern, right?
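
For a classifier, the post-processing in question amounts to mapping the output scores to a label. A minimal sketch, assuming the model emits one probability per line of `retrained_labels.txt`; the label order and the 0.5 threshold are assumptions, and `top_label` is a hypothetical helper rather than a DeepStream API:

```python
def top_label(scores, labels, threshold=0.5):
    """Return (label, score) for the best class, or None below the threshold."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return None
    return labels[best], scores[best]


# Assumed label order; in practice read it from retrained_labels.txt.
labels = ["diseased", "nondiseased"]
result = top_label([0.83, 0.17], labels)
print(result)
```

Anything beyond this, such as parsing a non-standard output layout, would be the custom part the user has to supply.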

After digging deeper into the Intel article, it turned out that many pieces of the puzzle are missing.
However, as it has the dataset sources, it should be possible to train a model with the Google AI interface.
After uploading the datasets, it will become visible which options it supports for exporting the model.


However, following the Intel article, attempt #1:

git clone https://github.com/javathunderman/diabetic-retinopathy-screening
cd diabetic-retinopathy-screening/
git clone https://github.com/Nomikxyz/retinopathy-dataset
mkdir images
cd images
mkdir diseased
mkdir nondiseased
cd ..

Then copy ~250 files from the retinopathy-dataset symptoms folder to the diseased folder and ~250 images from the nosymptoms folder to the nondiseased folder.
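
The copy step can be scripted. A minimal sketch, assuming the dataset clone sits next to the `images` directory and that its two class folders are named `symptoms` and `nosymptoms` (both names are assumptions about the dataset layout); `copy_sample` is a hypothetical helper:

```python
import shutil
from pathlib import Path


def copy_sample(src_dir, dst_dir, limit=250):
    """Copy up to `limit` files (sorted by name) from src_dir into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src.iterdir() if p.is_file())[:limit]
    for path in files:
        shutil.copy2(path, dst / path.name)
    return len(files)


if __name__ == "__main__":
    # Assumed layout: run from diabetic-retinopathy-screening/ after the clone.
    dataset = Path("retinopathy-dataset")
    for src_name, dst_name in [("symptoms", "diseased"), ("nosymptoms", "nondiseased")]:
        src = dataset / src_name
        if src.is_dir():
            print(dst_name, copy_sample(src, Path("images") / dst_name))
        else:
            print("missing source folder:", src)
```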

Running the retrain script as per Intel's tutorial:

 python3 retrain.py   --bottleneck_dir=bottlenecks   --how_many_training_steps=300   --model_dir=inception   --output_graph=retrained_graph.pb   --output_labels=retrained_labels.txt   --image_dir=images/
2020-09-03 21:31:50.740715: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:From retrain.py:1063: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

WARNING:tensorflow:From retrain.py:773: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.

W0903 21:31:57.186100 548329693200 module_wrapper.py:139] From retrain.py:773: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.

WARNING:tensorflow:From retrain.py:774: The name tf.gfile.DeleteRecursively is deprecated. Please use tf.io.gfile.rmtree instead.

W0903 21:31:57.186951 548329693200 module_wrapper.py:139] From retrain.py:774: The name tf.gfile.DeleteRecursively is deprecated. Please use tf.io.gfile.rmtree instead.

WARNING:tensorflow:From retrain.py:775: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

W0903 21:31:57.189463 548329693200 module_wrapper.py:139] From retrain.py:775: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.

WARNING:tensorflow:From retrain.py:248: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0903 21:32:00.193557 548329693200 module_wrapper.py:139] From retrain.py:248: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-09-03 21:32:00.575117: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-03 21:32:00.680931: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:00.681140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.109
pciBusID: 0000:00:00.0
2020-09-03 21:32:00.681223: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 21:32:00.806538: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 21:32:00.927444: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 21:32:01.063370: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 21:32:01.133752: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 21:32:01.194147: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 21:32:01.248698: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 21:32:01.250449: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.252056: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.252207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-03 21:32:01.279084: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2020-09-03 21:32:01.279771: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3bd50110 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 21:32:01.280047: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-03 21:32:01.370448: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.371520: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3bda7c70 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 21:32:01.371653: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Xavier, Compute Capability 7.2
2020-09-03 21:32:01.372743: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.372967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.109
pciBusID: 0000:00:00.0
2020-09-03 21:32:01.373225: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 21:32:01.373406: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 21:32:01.373497: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 21:32:01.373560: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 21:32:01.373710: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 21:32:01.373840: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 21:32:01.374012: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 21:32:01.374204: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.374432: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:01.374519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-03 21:32:01.374623: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 21:32:03.030578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 21:32:03.030881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181]      0 
2020-09-03 21:32:03.030990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0:   N 
2020-09-03 21:32:03.031640: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:03.032137: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:03.032520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 261 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
WARNING:tensorflow:From retrain.py:252: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

W0903 21:32:03.078023 548329693200 module_wrapper.py:139] From retrain.py:252: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

2020-09-03 21:32:07.218916: W tensorflow/core/framework/op_def_util.cc:357] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
Looking for images in 'diseased'
Looking for images in 'nondiseased'
2020-09-03 21:32:08.310473: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:08.317415: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties: 
name: Xavier major: 7 minor: 2 memoryClockRate(GHz): 1.109
pciBusID: 0000:00:00.0
2020-09-03 21:32:08.410976: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 21:32:08.463758: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 21:32:08.463979: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 21:32:08.476783: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 21:32:08.500267: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 21:32:08.523746: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 21:32:08.547276: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 21:32:08.547591: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:08.548069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:08.548238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-03 21:32:08.548783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 21:32:08.548841: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181]      0 
2020-09-03 21:32:08.549917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0:   N 
2020-09-03 21:32:08.550348: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:08.550734: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 21:32:08.551027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 261 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
Creating bottleneck at bottlenecks/diseased/13638_left.jpeg.txt
2020-09-03 21:32:55.810282: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 21:33:55.315417: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:33:56.602337: W tensorflow/stream_executor/cuda/ptxas_utils.cc:83] Couldn't invoke /usr/local/cuda/bin/ptxas --version
2020-09-03 21:33:59.154034: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:01.168701: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-09-03 21:34:32.605687: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:36.945777: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:37.705258: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:44.310620: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.359441: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.717019: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.723566: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.803360: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.835004: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.873237: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:46.879784: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:34:50.410013: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:35:00.258402: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:36:59.440165: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:37:29.503268: E tensorflow/core/platform/posix/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-09-03 21:37:40.754483: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
Killed


After adding an 8 GB swap file in addition to the existing zram swap, the situation seems improved and training started.
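
The `Cannot allocate memory` errors and the final `Killed` are consistent with the process running out of memory. A minimal sketch for checking the available headroom before starting training; the sample numbers are made up for illustration, and on the device the text would come from `/proc/meminfo`:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def headroom_gib(info):
    """Available RAM plus free swap, in GiB."""
    kib = info.get("MemAvailable", 0) + info.get("SwapFree", 0)
    return kib / (1024 * 1024)


# Made-up sample values; on the device use open("/proc/meminfo").read().
sample = "MemAvailable:    2097152 kB\nSwapFree:        8388608 kB\n"
print(round(headroom_gib(parse_meminfo(sample)), 1))
```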

Following the Google AI alternative procedure:
Attempt #1:
Training has started from the uploaded dataset.

Hi,
We advise you to train on dGPU instead of on Jetson device.

@Amycao
Thank you for following up!
I have access to cloud resources (Amazon, GCP, etc.),
so I trained a model with Google AI based on the provided images.
The resulting file is as follows:
https://storage.googleapis.com/gaze-dev/model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb
Also, another attempt:

python3 retrain.py   --bottleneck_dir=bottlenecks   --how_many_training_steps=300   --model_dir=inception   --output_graph=retrained_graph.pb   --output_labels=retrained_labels.txt   --image_dir=images/

using the instructions from

resulted in
https://storage.googleapis.com/gaze-dev/retrained_labels.txt
https://storage.googleapis.com/gaze-dev/retrained_graph.pb

However, the question is: how can the pb be passed into Triton inference in DeepStream?
reference thread Implementing DeepStream/ TRT integration by Intels scenario - #20 by _av

Please check SSD sample, sources/objectDetector_SSD, README and code.

@Amycao
Thank you for following up!
Following the instructions in the /objectDetector_SSD/README:

wget https://storage.googleapis.com/gaze-dev/model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb
sudo apt-get install python-protobuf
# python /usr/lib/python2.7/dist-packages/uff/bin/convert_to_uff.py \
#     model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb -O NMS \
#     -p /usr/src/tensorrt/samples/sampleUffSSD/config.py \
#     -o sample_ssd_relu6.uff
python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py          model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb -O NMS          -p /usr/src/tensorrt/samples/sampleUffSSD/config.py          -o sample_ssd_relu6.uff
2020-09-21 05:49:22.542630: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
Loading model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb
Traceback (most recent call last):
  File "/usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py", line 143, in <module>
    main()
  File "/usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py", line 139, in main
    debug_mode=args.debug
  File "/usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py", line 274, in from_tensorflow_frozen_model
    with tf.gfile.GFile(frozen_file, "rb") as frozen_pb:
AttributeError: module 'tensorflow' has no attribute 'gfile'
# TensorFlow was installed with: sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'
Running the converter with Python 2 instead results in:
python /usr/lib/python2.7/dist-packages/uff/bin/convert_to_uff.py          model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb -O NMS          -p /usr/src/tensorrt/samples/sampleUffSSD/config.py          -o sample_ssd_relu6.uff
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/uff/bin/convert_to_uff.py", line 65, in <module>
    import uff
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/__init__.py", line 49, in <module>
    from uff.converters.tensorflow.conversion_helpers import from_tensorflow  # noqa
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py", line 59, in <module>
    from .converter_functions import *  # noqa
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/converter_functions.py", line 59, in <module>
    from uff.converters.tensorflow.converter import TensorFlowToUFFConverter as tf2uff
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/converter.py", line 60, in <module>
    from tensorflow.compat.v1 import AttrValue
ImportError: No module named tensorflow.compat.v1
nvidia@nvidia-desktop:~/dev$ 

Should I reinstall TensorFlow as a 1.x version, or do something else?

sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'
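
A quick way to confirm which TensorFlow the interpreter actually picks up after the reinstall is sketched below. It relies on two things visible in the logs: the first traceback shows `tf.gfile` missing, and the converter prints that UFF has been tested with TensorFlow 1.15.0. The helper names `major_version` and `supports_tf_gfile` are made up for this illustration; they are not part of TensorFlow or UFF.

```python
def major_version(version):
    """Return the leading integer of a version string such as '1.15.2'."""
    return int(version.split(".")[0])


def supports_tf_gfile(version):
    """tf.gfile is a top-level alias in TensorFlow 1.x only (tf.io.gfile in 2.x)."""
    return major_version(version) == 1


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        # Matches the Python 2 case above, where TensorFlow could not be imported.
        print("TensorFlow is not importable from this interpreter")
    else:
        print(tf.__version__, "tf.gfile expected:", supports_tf_gfile(tf.__version__))
```

Run it with the same interpreter that launches convert_to_uff.py (python3 here), since Python 2 and Python 3 have separate dist-packages directories.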

After that, the error is different:

python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py          model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb -O NMS          -p /usr/src/tensorrt/samples/sampleUffSSD/config.py          -o sample_ssd_relu6.uff
2020-09-21 06:04:57.459174: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Loading model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb
WARNING:tensorflow:From /usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py:274: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

Traceback (most recent call last):
  File "/usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py", line 143, in <module>
    main()
  File "/usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py", line 139, in main
    debug_mode=args.debug
  File "/usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py", line 275, in from_tensorflow_frozen_model
    graphdef.ParseFromString(frozen_pb.read())
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/message.py", line 199, in ParseFromString
    return self.MergeFromString(serialized)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/python_message.py", line 1145, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/python_message.py", line 1212, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 754, in DecodeField
    if value._InternalParse(buffer, pos, new_pos) != new_pos:
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/python_message.py", line 1212, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 733, in DecodeRepeatedField
    if value.add()._InternalParse(buffer, pos, new_pos) != new_pos:
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/python_message.py", line 1212, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 888, in DecodeMap
    if submsg._InternalParse(buffer, pos, new_pos) != new_pos:
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/python_message.py", line 1199, in InternalParse
    buffer, new_pos, wire_type)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 989, in _DecodeUnknownField
    (data, pos) = _DecodeUnknownFieldSet(buffer, pos)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 968, in _DecodeUnknownFieldSet
    (data, pos) = _DecodeUnknownField(buffer, pos, wire_type)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/internal/decoder.py", line 993, in _DecodeUnknownField
    raise _DecodeError('Wrong wire type in tag.')
google.protobuf.message.DecodeError: Wrong wire type in tag.
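
The traceback shows convert_to_uff.py reading its input through `from_tensorflow_frozen_model`, that is, parsing the file as a frozen GraphDef, while the file name here ends in `saved_model.pb`. One possible explanation (an assumption, not confirmed in this thread) is that the file is a SavedModel export rather than a frozen graph. The sketch below is a rough heuristic that only looks at the first protobuf tag: in TensorFlow's proto definitions, field 1 of a GraphDef is the length-delimited `node` list, while field 1 of a SavedModel is the varint `saved_model_schema_version`. The function names are hypothetical, and the result is a hint, not a proof.

```python
def first_tag(data):
    """Decode the first protobuf tag into (field_number, wire_type)."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return value >> 3, value & 0x7
        shift += 7
    raise ValueError("empty or truncated input")


def guess_pb_kind(data):
    """Heuristic guess at what a TensorFlow .pb file contains."""
    field, wire_type = first_tag(data)
    if (field, wire_type) == (1, 2):
        return "graphdef"    # field 1, length-delimited: GraphDef.node
    if (field, wire_type) == (1, 0):
        return "savedmodel"  # field 1, varint: SavedModel.saved_model_schema_version
    return "unknown"


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, guess_pb_kind(handle.read(16)))
```

If the file turns out to be a SavedModel, it would need to be frozen into a single GraphDef before the UFF converter can read it.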

or, with Python 2:

 python /usr/lib/python2.7/dist-packages/uff/bin/convert_to_uff.py          model-555139022817591296_tf-saved-model_2020-09-08T00_12_38.738Z_saved_model.pb -O NMS          -p /usr/src/tensorrt/samples/sampleUffSSD/config.py          -o sample_ssd_relu6.uff
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/uff/bin/convert_to_uff.py", line 65, in <module>
    import uff
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/__init__.py", line 49, in <module>
    from uff.converters.tensorflow.conversion_helpers import from_tensorflow  # noqa
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py", line 59, in <module>
    from .converter_functions import *  # noqa
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/converter_functions.py", line 59, in <module>
    from uff.converters.tensorflow.converter import TensorFlowToUFFConverter as tf2uff
  File "/usr/lib/python2.7/dist-packages/uff/bin/../../uff/converters/tensorflow/converter.py", line 60, in <module>
    from tensorflow.compat.v1 import AttrValue
ImportError: No module named tensorflow.compat.v1

On the other hand, while the model above fails to convert, the Inception graph retrained in the Intel scenario (retrained_graph.pb) does produce a UFF file:

python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py          retrained_graph.pb -O NMS          -p /usr/src/tensorrt/samples/sampleUffSSD/config.py          -o sample_ssd_relu6.uff
2020-09-21 06:07:44.955945: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Loading retrained_graph.pb
WARNING:tensorflow:From /usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/conversion_helpers.py:274: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

NOTE: UFF has been tested with TensorFlow 1.15.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
UFF Version 0.6.9
=== Automatically deduced input nodes ===
[name: "Input"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: 1
      }
      dim {
        size: 3
      }
      dim {
        size: 300
      }
      dim {
        size: 300
      }
    }
  }
}
]
=========================================

Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS_TRT yet.
Converting NMS as custom op: NMS_TRT
WARNING:tensorflow:From /usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/converter.py:226: The name tf.AttrValue is deprecated. Please use tf.compat.v1.AttrValue instead.

DEBUG [/usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/converter.py:143] Marking ['NMS'] as outputs
No. nodes: 2
UFF Output written to sample_ssd_relu6.uff
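
Note that the converter reports "No. nodes: 2", so the written UFF graph holds only two nodes, far fewer than an Inception network contains. A plausible reading (an assumption, not something confirmed here) is that the SSD-specific config.py does not match this graph, so "UFF Output written" does not by itself mean the model is usable. A small sketch for flagging this from saved converter output follows; the function names and the `min_nodes` threshold of 10 are arbitrary choices for illustration.

```python
import re


def uff_node_count(log_text):
    """Return the integer after 'No. nodes:' in converter output, or None."""
    match = re.search(r"^No\. nodes:\s*(\d+)\s*$", log_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def looks_collapsed(log_text, min_nodes=10):
    """True when the reported node count is suspiciously small (arbitrary threshold)."""
    count = uff_node_count(log_text)
    return count is not None and count < min_nodes


sample_log = """Using output node NMS
Converting to UFF graph
No. nodes: 2
UFF Output written to sample_ssd_relu6.uff
"""
print(uff_node_count(sample_log), looks_collapsed(sample_log))
```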

Trying to run the pipeline:

gst-launch-1.0 filesrc location=../../samples/streams/sample_1080p_h264.mp4 ! \
        decodebin ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 \
        height=720 ! nvinfer config-file-path= config_infer_primary_ssd.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink
Warn: 'threshold' parameter has been deprecated. Use 'pre-cluster-threshold' instead.
Setting pipeline to PAUSED ...

Using winsys: x11 
ERROR: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine open error
0:00:01.556749523 19613   0x55aa8638c0 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1690> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine failed
0:00:01.557178850 19613   0x55aa8638c0 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1797> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine failed, try rebuild
0:00:01.557236900 19613   0x55aa8638c0 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
#assertionnmsPlugin.cpp,82
Aborted (core dumped)
 cp retrained_labels.txt /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/ssd_coco_labels.txt
 deepstream-app -c deepstream_app_config_ssd.txt
Warn: 'threshold' parameter has been deprecated. Use 'pre-cluster-threshold' instead.

Using winsys: x11 
ERROR: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine open error
0:00:01.224137066 19805     0x3d17b260 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1690> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine failed
0:00:01.224347924 19805     0x3d17b260 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1797> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff_b1_gpu0_fp32.engine failed, try rebuild
0:00:01.224382102 19805     0x3d17b260 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
#assertionnmsPlugin.cpp,82
Aborted (core dumped)
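
In both runs the "Deserialize engine failed ... open error" lines only say that no cached engine file could be opened; the log then shows nvinfer falling back to "Trying to create engine from model files", and the abort comes from nmsPlugin.cpp during that build. The sketch below checks for the cached engine ahead of time. The naming pattern is inferred from the path in the log above (model file name plus `_b1_gpu0_fp32.engine`), and `expected_engine_path` is a hypothetical helper, not a DeepStream API.

```python
from pathlib import Path


def expected_engine_path(model_file, batch_size=1, gpu_id=0, precision="fp32"):
    """Build the engine path next to the model, following the pattern seen in the log."""
    model = Path(model_file)
    suffix = "_b{}_gpu{}_{}.engine".format(batch_size, gpu_id, precision)
    return model.with_name(model.name + suffix)


if __name__ == "__main__":
    uff = "/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/sample_ssd_relu6.uff"
    engine = expected_engine_path(uff)
    state = "found" if engine.exists() else "missing, nvinfer will try to rebuild it"
    print(engine, "->", state)
```

A missing engine file is therefore expected on a first run; the failure to look at is the assertion raised while building from the UFF file.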

Please make sure the UFF model is generated correctly; otherwise the sample will fail when run with it.
If you follow the README and use the versions it specifies, you will be able to generate the UFF model.