Given an .engine file and an h5 file, how do I incorporate them into DeepStream?

Could you point out where to start? How do I narrow down the search?
From the documentation:

You may specify either a TensorRT engine file or a .etlt model in the DeepStream configuration file.

Exactly which configuration file needs to be edited? How do I determine that?
Update: it seems the first step to try is

 You must specify the applicable configuration parameters in the [property] group of the nvinfer configuration file (for example, config_infer_primary.txt).

So the following parameter will go into config_infer_primary.txt:

model-engine-file, if already generated

What other parameters need to go there for an .engine file?

From https://docs.nvidia.com/metropolis/deepstream/dev-guide/index.html#page/DeepStream_Development_Guide/deepstream_custom_model.html#
it seems that additional documentation can be found in “the header file nvdsinfer_custom_impl.h.”

Exactly which parameters from the list must I figure out in order to load the model?

The configuration parameters that you must specify include:

•model-file (Caffe model)

•proto-file (Caffe model)

•uff-file (UFF models)

•onnx-file (ONNX models)

•model-engine-file, if already generated

•int8-calib-file for INT8 mode

•mean-file, if required

•offsets, if required

•maintain-aspect-ratio, if required

•parse-bbox-func-name (detectors only)

•parse-classifier-func-name (classifiers only)

•custom-lib-path

•output-blob-names (Caffe and UFF models)

•network-type

•model-color-format

•process-mode

•engine-create-func-name

•infer-dims (UFF models)

•uff-input-order (UFF models)

Hi guys,
given there are two files, h5 and .engine,
could you provide exact instructions for running DeepStream, please?
With the former? The latter? Both? Neither of the two?

Hi @Andrey1984,
These are the model-file options; you only need to specify one of them to tell DS which model to use.

Since you want to use TLT mode, I think you could start with DeepStream TLT samples - NVIDIA DeepStream SDK Developer Guide — DeepStream 6.3 Release documentation

Specify it where exactly? In the file named config_infer_primary.txt? Somewhere else? Together with which parameters, or just the model file at this point?

Hi @mchi,
Thank you for your response!

What made you conclude that I want to use TLT mode? Could you elaborate on whether TLT mode is the only one that will work, given that I just got the h5 and .engine files [the engine was made from the h5 via a chain of conversions] that I have to import into DeepStream?
Are there any exact instruction steps, or an exact list of parameters that needs to be modified in a specific configuration file in order to import the .engine or a derivative of the h5?
The complication is that when I investigated how to import an .engine, I was not able to locate any exact list of parameters that I need to procure and specify in order to get a model provided by a third party to load into DeepStream. Then I got the h5. From the h5 I can also get ONNX with a converter. It is my understanding that UFF is better than ONNX, but is that also a different scenario?
Could you point out how to incorporate the existing .engine file? Which parameters besides the path to the model file need to be specified? What is the simplest way to get it running from the h5? From the .engine? Which is the easiest route? Could you share more exact details on which arguments, apart from the model name, need to be defined, please?
Moreover, it is not clear how the shared TLT URL is related to .engine or Keras h5, as these seem to be the only scenarios not addressed in the tutorial [the TLT URL shared above]. Could you explain how it could be applied, please?
I have not previously imported any model into DS, so this is my first attempt. Exact details are greatly appreciated.
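
For reference, one possible path from the Keras h5 to something nvinfer can consume might look like the following; this is only a sketch, the file names are placeholders, and whether the --keras option is available depends on the tf2onnx/TensorFlow versions installed:

# convert the Keras h5 to ONNX (placeholder file names)
python3 -m tf2onnx.convert --keras model.h5 --output model.onnx

# optionally pre-build a TensorRT engine from the ONNX on the target device;
# otherwise nvinfer can build (and cache) the engine itself when onnx-file is set
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model_fp16.engine --fp16

Either the resulting .engine (via model-engine-file) or the .onnx (via onnx-file) can then be referenced from the nvinfer configuration.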


Attempt 1:

objective: specify “model-engine-file, if already generated”
complications: specify where? specify in exactly which manner?
assumptions: specify it in config_infer_primary.txt
complications: what is the next step?

Practical implementation:

  1. docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-5.0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /home/nvidia/import:/import nvcr.io/nvidia/deepstream-l4t:5.0-dp-20.04-samples
     root@nx:/opt/nvidia/deepstream/deepstream-5.0#
     apt update -y && apt install nano -y && apt install mlocate -y && updatedb
     locate config_infer_primary.txt
     /opt/nvidia/deepstream/deepstream-5.0/samples/configs/deepstream-app/config_infer_primary.txt
    
    
2.

Only the engine file is needed; the h5 file is not needed for DeepStream.

About the execution command, please refer to the README: /opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/README
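
For orientation, the sample application is launched with a top-level config that in turn references config_infer_primary.txt; a typical invocation inside the container (the sample config name below is just one of the shipped examples) would be something like:

cd /opt/nvidia/deepstream/deepstream-5.0/samples/configs/deepstream-app
deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt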

So I am only adding the engine like that?

nano /opt/nvidia/deepstream/deepstream-5.0/samples/configs/deepstream-app/config_infer_primary.txt

then finding the line to change;
which is the line number?
which is the exact line?
The first section looks like:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-file=../../models/Primary_Detector/resnet10.caffemodel
proto-file=../../models/Primary_Detector/resnet10.prototxt
model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b30_gpu0_in$
labelfile-path=../../models/Primary_Detector/labels.txt
int8-calib-file=../../models/Primary_Detector/cal_trt.bin
batch-size=30
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
force-implicit-batch-dim=1

Do I remove all the lines from the section above? Some of the lines? And then add one simple line?

model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b30_gpu0_int8.engine

that is the line that I rewrite as:

model-engine-file=/import/model_frozen_fp16.engine

So I rewrite one line, but what do I do with the many other lines?
What is the complete set of components/parameters that needs to be defined in the configuration file? Will the standalone engine path be enough for the whole thing to work, or will a bulk of other parameters need to be specified?

As I mentioned above, for the model, only one of the five options below is needed; that means if you specify model-engine-file, the other four are not needed.
The other configs, e.g. process-mode, num-detected-classes, etc., are still needed, according to the properties of your model.

This is an example:
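
As a rough illustration (the paths, batch size, class count, and precision here are placeholders; the exact set of keys depends on the model and its post-processing):

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
# only the engine is specified; model-file/proto-file/uff-file/onnx-file are omitted
model-engine-file=/import/model_frozen_fp16.engine
labelfile-path=/import/labels.txt
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
process-mode=1
model-color-format=0
gie-unique-id=1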

Thank you for your response!

That is the complication exactly!
As the model is provided by a third party as an .engine file [also as an h5], no other inputs were provided.
How do I know exactly which parameters I need to retrieve from the supplier in order to define the engine file with the complete set of parameters?

If you don’t know what your engine does, e.g. num-detected-classes, which is information not included in the engine file,
I don’t know how we can help you.

Could you point to a complete reference list of the properties required for a Keras model, perhaps?
Do you have any Keras reference with a full list of parameters,
so that I can try to retrieve the model properties from the supplier according to that list?
Will that work? Likely? Unlikely? Highly unlikely? I anticipate that DeepStream is agnostic to the model parameters as long as they are supplied, but how do I determine exactly which of them need to be supplied?

I have a Python wrapper for running the .engine model file with TensorRT.
Could these missing parameters be retrieved from there? Or are they not present for TRT runtime execution but still required by DeepStream? Why would DeepStream require parameters that the TensorRT runtime doesn’t require?
@mchi ?
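
Some of the network-level details (input dimensions, output binding names) can usually be read back from the serialized engine itself; a minimal sketch using the TensorRT Python API shipped with DeepStream 5.0 (TRT 7.x bindings; the engine path is the one from this thread):

import tensorrt as trt

ENGINE_PATH = "/import/model_frozen_fp16.engine"

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# print each binding: input/output, name and shape;
# output names feed output-blob-names / the parsing code,
# the input shape tells you what resolution nvinfer must feed the model
for i in range(engine.num_bindings):
    kind = "input " if engine.binding_is_input(i) else "output"
    print(kind, engine.get_binding_name(i), engine.get_binding_shape(i))

Semantic details such as num-detected-classes, label meanings, and the expected normalization (net-scale-factor, offsets) are not stored in the engine and still have to come from the model supplier.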

Much of the needed information concerns post-processing if you provide a TRT engine to DeepStream.
I am not sure what post-processing your model requires, so I am not sure whether DeepStream already supports it.

Can you provide more info about your model? Otherwise it’s hard to say anything about likely, unlikely, etc.
Alternatively, the DS documentation explains the meaning and usage of the nvinfer parameters; you could go through it, cross-checking against your model.

One parameter has been retrieved from the model supplier:
num_classes = 1
It seems there are still many parameters left to find out?

yes, you need to get them outside of the model itself

@mchi
Will it be easier to incorporate the following solution into DeepStream, given that the sources are provided as-is in the documentation below? Would you be able to help with such an implementation?

I think it’s feasible. The steps should be:

  1. convert the pytorch model to onnx
  2. configure the gie config with the onnx, and implement the detection post-processing (see the config sketch below)
  3. inference with video or image as input for DeepStream
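
A rough sketch of the nvinfer (gie) config for step 2, assuming the ONNX route; every file name, the parse function name, and the library path below are placeholders for whatever the post-processing implementation ends up being:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
onnx-file=/import/model.onnx
# nvinfer builds and caches an engine from the ONNX on first run
model-engine-file=/import/model.onnx_b1_gpu0_fp16.engine
labelfile-path=/import/labels.txt
batch-size=1
network-mode=2
network-type=0
num-detected-classes=1
process-mode=1
gie-unique-id=1
# custom detection post-processing (placeholders)
parse-bbox-func-name=NvDsInferParseCustomModel
custom-lib-path=/import/libnvds_infercustomparser_custom.so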

Attempt 1.
Step 1.
Downloading the dataset.

wget https://github.com/javathunderman/retinopathy-dataset/archive/master.zip

Downloading the frozen graph:

wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip
 unzip inception_dec_2015.zip 
Archive:  inception_dec_2015.zip
  inflating: imagenet_comp_graph_label_strings.txt  
  inflating: LICENSE                 
  inflating: tensorflow_inception_graph.pb  

Using the DeepStream 5 environment:

c4e41ec4dce6        nvcr.io/nvidia/deepstream-l4t:5.0-dp-20.04-samples 
docker start c4e41ec4dce6
docker exec -it c4e41ec4dce6 bash
:/opt/nvidia/deepstream/deepstream-5.0# 

:/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb  --output tensorflow_inception_graph.onnx
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/__init__.py", line 14, in <module>
    from . import verbose_logging as logging
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/verbose_logging.py", line 14, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Further on it ran into a cusolver issue on a read-only part of the system, so it cannot be worked around with a symlink as in the example above.
Trying a different container from NGC, the ML/TensorFlow image:

/import# python3 -m tf2onnx.convert --input tensorflow_inception_graph.pb  --output tensorflow_inception_graph.onnx
2020-09-03 10:48:07.690127: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.147892: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-03 10:48:15.153429: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.153618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.153712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.224170: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.305383: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.417929: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.554397: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.631347: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.632692: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.633105: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633556: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.633783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.669545: W tensorflow/core/platform/profile_utils/cpu_utils.cc:106] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2020-09-03 10:48:15.670181: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4f27350 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.670261: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-03 10:48:15.835514: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.836804: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5093da0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 10:48:15.836950: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Xavier, Compute Capability 7.2
2020-09-03 10:48:15.863081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.863471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.109GHz coreCount: 6 deviceMemorySize: 7.59GiB deviceMemoryBandwidth: 66.10GiB/s
2020-09-03 10:48:15.863658: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:15.864142: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-03 10:48:15.864439: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-03 10:48:15.864609: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-03 10:48:15.864705: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-03 10:48:15.864817: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-03 10:48:15.864893: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-03 10:48:15.865351: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:15.865973: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-03 10:48:15.866410: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-09-03 10:48:22.266070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 10:48:22.266233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-09-03 10:48:22.266291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-09-03 10:48:22.266850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.267934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-09-03 10:48:22.268273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2525 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2020-09-03 10:48:23.903300: W tensorflow/core/framework/op_def_util.cc:371] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 171, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/convert.py", line 125, in main
    graph_def, inputs, outputs = tf_loader.from_graphdef(args.graphdef, args.inputs, args.outputs)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 150, in from_graphdef
    frozen_graph = freeze_session(sess, input_names=input_names, output_names=output_names)
  File "/usr/local/lib/python3.6/dist-packages/tf2onnx/tf_loader.py", line 113, in freeze_session
    output_node_names = [i.split(':')[:-1][0] for i in output_names]
TypeError: 'NoneType' object is not iterable

Probably I should try with TensorFlow 1.4?
Obviously the first attempt to use the instructions to install the specific version 1.4.0 of TensorFlow fails:

 pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 tensorflow==1.4.0+nv20.08         
Looking in indexes: https://pypi.org/simple, https://developer.download.nvidia.com/compute/redist/jp/v44
ERROR: Could not find a version that satisfies the requirement tensorflow==1.4.0+nv20.08 (from versions: 1.15.2+nv20.4, 1.15.2+nv20.6, 1.15.3+nv20.7, 1.15.3+nv20.8, 2.1.0+nv20.4, 2.2.0+nv20.6, 2.2.0+nv20.7, 2.2.0+nv20.8)
ERROR: No matching distribution found for tensorflow==1.4.0+nv20.08

Since it doesn’t appear possible to get version 1.4.0, I have to use

$ sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'

Hi @Andrey1984,
This is a new topic that has nothing to do with the original issue; please file a new topic.
And if the original questions you asked have been addressed, could you mark this one closed?

BTW, for the ONNX conversion, the .pb alone is not enough.
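
That matches the traceback above: tf2onnx also needs the graph’s input and output tensor names, which the frozen .pb does not provide by itself. A hedged sketch of the invocation (the tensor names here are placeholders and have to be taken from the actual graph, e.g. by inspecting it with a tool such as Netron):

python3 -m tf2onnx.convert \
    --input tensorflow_inception_graph.pb \
    --inputs input:0 \
    --outputs output:0 \
    --output tensorflow_inception_graph.onnx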