Run model engine on Jetson NX with Python

Hello! I am a beginner user on the Jetson NX (JetPack 4.5).

I have several (maybe simple) questions whose answers I cannot find on the Internet.

I have 2 classification CNN models (about 10 M parameters each) written in Python + TF 2.0.

I would like to use them on Jetson NX.

  1. When I converted my models using the TF converter, they work OK, but the loading process on the Jetson NX is very slow: about 400 sec! Is that normal for this device? If so, there is no way to quickly reload a model during the work process… which is very bad.

Maybe someone knows and can tell me what else I should do?

  2. Another way I found on the Internet is to convert the TF model to ONNX format (the tf2onnx package); then we can save an engine and use it via pycuda or onnx-tensorrt.

I spent a lot of time and installed tf2onnx. I figured out how to convert a TF model to ONNX format and run inference on it using onnxruntime.

But the model loading speed is still slow. Elsewhere on the Internet I found how to create and save a TensorRT engine. It loads very quickly. But I do not know at all how to run inference with this engine.
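The conversion and engine-saving steps look roughly like this (paths, the opset, and the FP16 flag are placeholders for my setup; trtexec ships with JetPack under /usr/src/tensorrt/bin):

```shell
# 1. TF SavedModel -> ONNX:
python3 -m tf2onnx.convert --saved-model ./saved_model --output model.onnx --opset 11

# 2. ONNX -> serialized TensorRT engine:
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```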

I found that we can use pycuda. But it does not install on JetPack 4.5: I get a lot of errors. I used this command:

sudo pip3 install --global-option=build_ext --global-option="-I/usr/local/cuda/include" --global-option="-L/usr/local/cuda/lib64" pycuda

I found that we can use onnx-tensorrt. But it requires a newer cmake utility (which I updated), and then the install process of onnx-tensorrt also gives me a lot of errors.

Why is it so hard to use this device in production? Why is there no simple instruction for how to convert a model to TensorRT format and run inference on it?

Please help me and tell me what I can do to solve these problems.

Hi,

1. Have you maximized the device performance first?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Also, could you check the detailed initialization time as well as the inference time?
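For example, a small timer helper can split the two phases apart (the commented-out calls are placeholders for your own load/predict functions):

```python
import time

def timed(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs), print the elapsed wall time, return the result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - start:.2f} s")
    return result

# Placeholders -- substitute your real calls:
# model  = timed("init",  tf.keras.models.load_model, "saved_model")
# output = timed("infer", model.predict, batch)
```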

It’s expected that initialization tends to be slow since TensorFlow is a heavy framework.
But the inference time should be fast.

2. pyCUDA can be installed via the following command:

$ pip3 install numpy pycuda --user

Then you can find several examples for deploying a TensorRT engine below:
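As a starting point, here is a minimal inference sketch based on the common pattern in the TensorRT Python samples. It assumes a single input binding (index 0) and a single output binding (index 1); the engine path and input array come from your own code:

```python
from functools import reduce

def volume(shape):
    """Number of elements in a tensor shape (same idea as trt.volume)."""
    return reduce(lambda a, b: a * b, shape, 1)

def infer(engine_path, input_array):
    """Run one batch through a serialized TensorRT engine."""
    # Heavy imports kept inside the function so the sketch parses on a
    # machine without TensorRT/pycuda installed.
    import numpy as np
    import tensorrt as trt
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as cuda

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # One page-locked host buffer and one device buffer per binding.
    bindings, host_bufs, dev_bufs = [], [], []
    for binding in engine:
        shape = engine.get_binding_shape(binding)
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        host = cuda.pagelocked_empty(volume(shape), dtype)
        dev = cuda.mem_alloc(host.nbytes)
        bindings.append(int(dev))
        host_bufs.append(host)
        dev_bufs.append(dev)

    # Copy input in, run, copy output out (binding 0 = input, 1 = output).
    stream = cuda.Stream()
    np.copyto(host_bufs[0], input_array.ravel())
    cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(host_bufs[1], dev_bufs[1], stream)
    stream.synchronize()
    return host_bufs[1]
```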

Thanks.

Thank you for your response!

  1. The command sudo nvpmodel -m 0 turns on the 15W2CORE mode. I use 15W6CORE. Do I understand correctly that in mode 0 we will get faster results for TensorRT models?

  2. This command (pip3 install numpy pycuda --user) gives me the following errors:
    pip3 install pycuda --user
    Collecting pycuda
    Using cached pycuda-2021.1.tar.gz (1.7 MB)
    Installing build dependencies … error
    ERROR: Command errored out with exit status 1:
    command: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-yivouqca/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i Simple index -- setuptools wheel 'numpy; python_version >= '"'"'3.10'"'"'' 'numpy==1.19.5; python_version >= '"'"'3.8'"'"' and python_version < '"'"'3.10'"'"'' 'numpy==1.15.4; python_version >= '"'"'3.7'"'"' and python_version < '"'"'3.8'"'"'' 'numpy==1.12.1; python_version < '"'"'3.7'"'"''
    cwd: None
    Complete output (2488 lines):
    Ignoring numpy: markers ‘python_version >= “3.10”’ don’t match your environment
    Ignoring numpy: markers ‘python_version >= “3.8” and python_version < “3.10”’ don’t match your environment
    Ignoring numpy: markers ‘python_version >= “3.7” and python_version < “3.8”’ don’t match your environment
    Collecting setuptools
    Using cached setuptools-57.2.0-py3-none-any.whl (818 kB)
    Collecting wheel
    Using cached wheel-0.36.2-py2.py3-none-any.whl (35 kB)
    Collecting numpy==1.12.1
    Using cached numpy-1.12.1.zip (4.8 MB)
    Building wheels for collected packages: numpy
    Building wheel for numpy (setup.py): started
    Building wheel for numpy (setup.py): still running…
    Building wheel for numpy (setup.py): still running…
    Building wheel for numpy (setup.py): still running…
    Building wheel for numpy (setup.py): finished with status ‘error’
    ERROR: Command errored out with exit status 1:
    command: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-gt4iqm1r/numpy_4ffb26f2b4e3445eb0bec93cb49f69dc/setup.py'"'"'; __file__='"'"'/tmp/pip-install-gt4iqm1r/numpy_4ffb26f2b4e3445eb0bec93cb49f69dc/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-ig5n9soi
    cwd: /tmp/pip-install-gt4iqm1r/numpy_4ffb26f2b4e3445eb0bec93cb49f69dc/
    Complete output (2152 lines):
    Running from numpy source directory.
    blas_opt_info:
    blas_mkl_info:
    libraries mkl_rt not found in [’/usr/local/lib’, ‘/usr/lib’, ‘/usr/lib/aarch64-linux-gnu’]
    NOT AVAILABLE

    blis_info:

…etc…


WARNING: Discarding https://files.pythonhosted.org/packages/5a/56/4682a5118a234d15aa1c8768a528aac4858c7b04d2674e18d586d3dfda04/pycuda-2021.1.tar.gz#sha256=ab87312d0fc349d9c17294a087bb9615cffcf966ad7b115f5b051008a48dd6ed (from Links for pycuda) (requires-python:~=3.6). Command errored out with exit status 1: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-yivouqca/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i Simple index -- setuptools wheel 'numpy; python_version >= '"'"'3.10'"'"'' 'numpy==1.19.5; python_version >= '"'"'3.8'"'"' and python_version < '"'"'3.10'"'"'' 'numpy==1.15.4; python_version >= '"'"'3.7'"'"' and python_version < '"'"'3.8'"'"'' 'numpy==1.12.1; python_version < '"'"'3.7'"'"'' Check the logs for full command output.
Using cached pycuda-2020.1-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: decorator>=3.2.0 in /usr/lib/python3/dist-packages (from pycuda) (4.1.2)
Requirement already satisfied: mako in /usr/lib/python3/dist-packages (from pycuda) (1.0.7)
Requirement already satisfied: pytools>=2011.2 in /usr/local/lib/python3.6/dist-packages (from pycuda) (2021.2.7)
Requirement already satisfied: appdirs>=1.4.0 in /usr/local/lib/python3.6/dist-packages (from pycuda) (1.4.4)
Requirement already satisfied: numpy>=1.6.0 in /usr/local/lib/python3.6/dist-packages (from pytools>=2011.2->pycuda) (1.19.4)
Requirement already satisfied: dataclasses>=0.7 in /usr/local/lib/python3.6/dist-packages (from pytools>=2011.2->pycuda) (0.8)
Installing collected packages: pycuda
Successfully installed pycuda-2020.1

Now pycuda appears in the list of packages, but I was surprised by the huge number of errors during the install…

Hi,

1. You can find the nvpmodel details below:
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/power_management_jetson_xavier.html#wwpID0E0VO0HA

Although mode 0 only enables two CPU cores, the clock rate is much higher.
Since TensorRT mainly uses the GPU for inference, the CPU boost helps launch kernels faster.

2. We don’t see a similar error when installing pyCUDA.
It looks like the error is mainly related to NumPy.
Is there anything weird when using pyCUDA?

Thanks.

Is there anything weird when using the pyCUDA?

In fact, the engines work quickly and the results look correct. But reloading a model is a slower process than inference.
For example, I take a batch of 8 images and get an inference time of about 0.10 sec for my CNN (about 10 M parameters, 300x300 input, 16-bit). But if I want to load another model, it can take about 0.25-0.30 sec. Is that OK? Maybe there is a way to improve the loading time?

Hi,

May I know which framework you currently use for inference: TensorFlow or TensorRT?

If TensorRT is used, have you tried serializing the engine, or do you convert it from the ONNX model every time?
Serialization can save you the conversion time that is otherwise triggered at initialization.
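A rough sketch of the two paths, using the TensorRT 7.x Python API shipped with JetPack 4.5 (file paths and the workspace size are placeholders):

```python
def build_and_save(onnx_path, engine_path):
    """Parse an ONNX model, build a TensorRT engine, serialize it to disk.

    This is the slow step -- run it once, offline.
    """
    import tensorrt as trt  # imported here so the sketch parses without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(logger) as builder, \
         builder.create_network(flag) as network, \
         trt.OnnxParser(network, logger) as parser:
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                raise RuntimeError(parser.get_error(0))
        config = builder.create_builder_config()
        config.max_workspace_size = 1 << 28  # 256 MB, tune for your model
        engine = builder.build_engine(network, config)
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())

def load_engine(engine_path):
    """Deserialize a saved engine -- fast, no conversion at startup."""
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        return runtime.deserialize_cuda_engine(f.read())
```

At startup you then call only load_engine(), which avoids the ONNX parsing and builder optimization passes entirely.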

Thanks.