Create inference graph failed on Agx Xavier

HuiW · March 3, 2020, 10:33am

We would like to run inference on AGX Xavier with our TensorFlow models, which are built with Keras.
We got an error while converting an .h5 model to an inference graph.

Here is the error message.
Traceback (most recent call last):
  File "trt.py", line 106, in <module>
    print (infer(test, tf_sess, inp, out) )
  File "trt.py", line 88, in infer
    y_batch = sess.run(output_tensor, feed_dict={input_tensor:x_batch})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 28, 28, 1) for Tensor 'conv2d_1/kernel:0', which has shape '(3, 3, 1, 32)'

To use TensorFlow 2.0, we installed JetPack 4.3 and TensorFlow 2.0.

Thank you for any advice,
tf_trt_transfer_fail_msg.txt (15.2 KB)

AastaLLL · March 4, 2020, 2:40am

Hi,

The error indicates the input size is incorrect.
If this model can be executed with Keras, this error looks abnormal to me.
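
To make the ValueError concrete: the feed is rejected because a (1, 28, 28, 1) batch was offered to 'conv2d_1/kernel:0', whose static shape is (3, 3, 1, 32). Below is a minimal sketch of that kind of shape check in plain Python. `is_feed_compatible` is a hypothetical helper written for illustration, not a TensorFlow API, and `None` stands for an unknown dimension such as the batch size.

```python
def is_feed_compatible(value_shape, tensor_shape):
    """Return True if a value of value_shape could be fed to a tensor of tensor_shape.

    None in tensor_shape means "any size" (for example the batch dimension).
    """
    if len(value_shape) != len(tensor_shape):
        return False
    return all(t is None or t == v for v, t in zip(value_shape, tensor_shape))


batch_shape = (1, 28, 28, 1)

# Shape of the tensor named in the error message (a convolution kernel).
print(is_feed_compatible(batch_shape, (3, 3, 1, 32)))      # False
# Shape of an image placeholder with an unknown batch dimension.
print(is_feed_compatible(batch_shape, (None, 28, 28, 1)))  # True
```

The mismatch suggests the feed is keyed on a weight tensor rather than on the image placeholder, which is worth checking in the script.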

Would you mind sharing your trt.py so we can debug it?

Thanks.

HuiW · March 4, 2020, 3:21am

Hi AastaLLL,

Thank you for your prompt support.
Please find trt.py attached.

Thank you,
trt.py.txt (3.6 KB)

HuiW · March 10, 2020, 4:00am

Hi,

Do you need further information?
Is there any suggestion?

Thank you,

AastaLLL · March 12, 2020, 5:10am

Hi,

Sorry for keeping you waiting.
Could you also share the cnn.h5 file so we can check it?

Thanks.

HuiW · March 12, 2020, 7:26am

Hi AastaLLL,

Attached please find the files.

Thank you,

AastaLLL · March 13, 2020, 6:30am

Hi,

We dumped the network layers of cnn.h5 together with their input and output connections (listed below).
Some of the layers are not connected, and one of them is the node named in the error.

Please first check whether there is an issue when converting the Keras network into a TensorFlow graph.

[b](<tf.Tensor 'conv2d_1/kernel:0' shape=(3, 3, 1, 32) dtype=float32>,)
(<tf.Tensor 'conv2d_1/bias:0' shape=(32,) dtype=float32>,)
(<tf.Tensor 'conv2d_2/kernel:0' shape=(3, 3, 32, 64) dtype=float32>,)
(<tf.Tensor 'conv2d_2/bias:0' shape=(64,) dtype=float32>,)
(<tf.Tensor 'dense_1/kernel:0' shape=(9216, 128) dtype=float32>,)
(<tf.Tensor 'dense_1/bias:0' shape=(128,) dtype=float32>,)
(<tf.Tensor 'dense_2/kernel:0' shape=(128, 10) dtype=float32>,)
(<tf.Tensor 'dense_2/bias:0' shape=(10,) dtype=float32>,)
(<tf.Tensor 'image_tensor_x:0' shape=(?, 28, 28, 1) dtype=float32>,)[/b]
(<tf.Tensor 'sequential/flatten_1/Reshape/shape:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'sequential/conv2d_1/Conv2D-0-PermConstNHWCToNCHW-LayoutOptimizer:0' shape=(4,) dtype=int32>,)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool-0-0-PermConstNCHWToNHWC-LayoutOptimizer:0' shape=(4,) dtype=int32>,)
(<tf.Tensor 'sequential/conv2d_1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer:0' shape=(?, 1, 28, 28) dtype=float32>,)
    Input: Tensor("image_tensor_x:0", shape=(?, 28, 28, 1), dtype=float32)
    Input: Tensor("sequential/conv2d_1/Conv2D-0-PermConstNHWCToNCHW-LayoutOptimizer:0", shape=(4,), dtype=int32, device=/job:localhost/replica:0/task:0/device:GPU:0)
(<tf.Tensor 'sequential/conv2d_1/Conv2D:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer:0", shape=(?, 1, 28, 28), dtype=float32, device=/job:localhost/replica:0/task:0/device:GPU:0)
    Input: Tensor("conv2d_1/kernel:0", shape=(3, 3, 1, 32), dtype=float32)
(<tf.Tensor 'sequential/conv2d_1/BiasAdd:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Conv2D:0", shape=(?, 32, 26, 26), dtype=float32)
    Input: Tensor("conv2d_1/bias:0", shape=(32,), dtype=float32)
(<tf.Tensor 'sequential/conv2d_1/Relu:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/BiasAdd:0", shape=(?, 32, 26, 26), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/Conv2D:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Relu:0", shape=(?, 32, 26, 26), dtype=float32)
    Input: Tensor("conv2d_2/kernel:0", shape=(3, 3, 32, 64), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/BiasAdd:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/Conv2D:0", shape=(?, 64, 24, 24), dtype=float32)
    Input: Tensor("conv2d_2/bias:0", shape=(64,), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/Relu:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/BiasAdd:0", shape=(?, 64, 24, 24), dtype=float32)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool:0' shape=(?, 64, 12, 12) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/Relu:0", shape=(?, 64, 24, 24), dtype=float32)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool-0-0-TransposeNCHWToNHWC-LayoutOptimizer:0' shape=(?, 12, 12, 64) dtype=float32>,)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool:0", shape=(?, 64, 12, 12), dtype=float32)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool-0-0-PermConstNCHWToNHWC-LayoutOptimizer:0", shape=(4,), dtype=int32, device=/job:localhost/replica:0/task:0/device:GPU:0)
(<tf.Tensor 'sequential/flatten_1/Reshape:0' shape=(?, 9216) dtype=float32>,)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool-0-0-TransposeNCHWToNHWC-LayoutOptimizer:0", shape=(?, 12, 12, 64), dtype=float32, device=/job:localhost/replica:0/task:0/device:GPU:0)
    Input: Tensor("sequential/flatten_1/Reshape/shape:0", shape=(2,), dtype=int32)
(<tf.Tensor 'sequential/dense_1/MatMul:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/flatten_1/Reshape:0", shape=(?, 9216), dtype=float32)
    Input: Tensor("dense_1/kernel:0", shape=(9216, 128), dtype=float32)
(<tf.Tensor 'sequential/dense_1/BiasAdd:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/dense_1/MatMul:0", shape=(?, 128), dtype=float32)
    Input: Tensor("dense_1/bias:0", shape=(128,), dtype=float32)
(<tf.Tensor 'sequential/dense_1/Relu:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/dense_1/BiasAdd:0", shape=(?, 128), dtype=float32)
(<tf.Tensor 'sequential/dense_2/MatMul:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_1/Relu:0", shape=(?, 128), dtype=float32)
    Input: Tensor("dense_2/kernel:0", shape=(128, 10), dtype=float32)
(<tf.Tensor 'sequential/dense_2/BiasAdd:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_2/MatMul:0", shape=(?, 10), dtype=float32)
    Input: Tensor("dense_2/bias:0", shape=(10,), dtype=float32)
(<tf.Tensor 'sequential/dense_2/Softmax:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_2/BiasAdd:0", shape=(?, 10), dtype=float32)
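
One way to read the dump: tensors listed without any "Input:" line are graph sources (weights, constants, or the placeholder), and only one of them, 'image_tensor_x:0', has an unknown batch dimension. Below is a minimal sketch, assuming the input tensor should be looked up by that property rather than by its position in the graph. The `find_input_candidates` helper and the hand-copied `sources` list are illustrative and not part of trt.py.

```python
# (name, shape) pairs hand-copied from the dump above; None replaces "?".
sources = [
    ("conv2d_1/kernel:0", (3, 3, 1, 32)),
    ("conv2d_1/bias:0", (32,)),
    ("conv2d_2/kernel:0", (3, 3, 32, 64)),
    ("conv2d_2/bias:0", (64,)),
    ("dense_1/kernel:0", (9216, 128)),
    ("dense_1/bias:0", (128,)),
    ("dense_2/kernel:0", (128, 10)),
    ("dense_2/bias:0", (10,)),
    ("image_tensor_x:0", (None, 28, 28, 1)),
]


def find_input_candidates(tensors):
    """Return names of source tensors whose leading dimension is unknown."""
    return [name for name, shape in tensors if shape and shape[0] is None]


print(find_input_candidates(sources))  # ['image_tensor_x:0']
```

If a script picks its input tensor by index (for example the first node in the graph), it would select 'conv2d_1/kernel:0', which matches the tensor named in the ValueError. That is only a guess, since trt.py is attached rather than shown in the thread.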

Thanks.

HuiW · March 19, 2020, 6:49am

Hi AastaLLL,

Thank you for your support.
Actually, the model we provided is from a sample on the Keras website.

Could you explain more about "converting the Keras network into TensorFlow graph first"?
We do not know how to do it.

By the way, do you expect a model built with TensorFlow rather than Keras?

To meet our speed requirement, we set up the AGX Xavier with TensorFlow 2.0.
All we want is to run any TensorFlow model with TensorRT on the AGX.

After that, we can deploy our customer’s TensorFlow model.
If the model we provided does not work, is there any TensorFlow 2.0 model we can test with?

Thank you for any advice,

HuiW · March 23, 2020, 3:10am

Hi AastaLLL,

We failed to create frozen graphs when we tried to build "tf_to_trt_image_classification" (PS: it was built for R28.2).

I attached a CNN model built with TensorFlow.
Could you help check whether the model can be converted to TensorRT?

Or is there a reference for building "tf_to_trt_image_classification"?
Here is the version I used:
tensorflow_gpu-2.0.0+nv20.1-cp36-cp36m-linux_aarch64.whl

And the error message:

nva@nva-desktop:~/tf_to_trt_image_classification/build$ make
[ 16%] Building NVCC (Device) object examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o
/usr/include/opencv4/opencv2/stitching/detail/warpers.hpp(213): warning: overloaded virtual function "cv::detail::PlaneWarper::buildMaps" is only partially overridden in class "cv::detail::AffineWarper"

/usr/include/opencv4/opencv2/stitching/detail/warpers.hpp(213): warning: overloaded virtual function "cv::detail::PlaneWarper::warp" is only partially overridden in class "cv::detail::AffineWarper"

/usr/include/opencv4/opencv2/stitching/detail/blenders.hpp(100): warning: overloaded virtual function "cv::detail::Blender::prepare" is only partially overridden in class "cv::detail::FeatherBlender"

/usr/include/opencv4/opencv2/stitching/detail/blenders.hpp(127): warning: overloaded virtual function "cv::detail::Blender::prepare" is only partially overridden in class "cv::detail::MultiBandBlender"

/home/nva/tf_to_trt_image_classification/examples/classify_image/classify_image.cu(95): error: identifier "CV_LOAD_IMAGE_COLOR" is undefined

1 error detected in the compilation of "/tmp/tmpxft_00002c2f_00000000-6_classify_image.cpp1.ii".
CMake Error at classify_image_generated_classify_image.cu.o.cmake:279 (message):
  Error generating file
  /home/nva/tf_to_trt_image_classification/build/examples/classify_image/CMakeFiles/classify_image.dir//./classify_image_generated_classify_image.cu.o


examples/classify_image/CMakeFiles/classify_image.dir/build.make:484: recipe for target 'examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o' failed
make[2]: *** [examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o] Error 1
CMakeFiles/Makefile2:103: recipe for target 'examples/classify_image/CMakeFiles/classify_image.dir/all' failed
make[1]: *** [examples/classify_image/CMakeFiles/classify_image.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
nva@nva-desktop:~/tf_to_trt_image_classification/build$ 

Again, all we want is to run any TensorFlow model with TensorRT on the AGX.
We need to check whether running a TensorFlow model on the AGX performs better than on an Intel x86 platform without a GPU card.
If TensorFlow 2.0 is not a suitable version to use on the AGX, could you advise which version to use?

Thank you,

AastaLLL · March 23, 2020, 8:08am

Hi,

Sorry for keeping you waiting.

1. The input of the TensorRT parser is the frozen .pb file.
Please convert the model into .pb from either TensorFlow or Keras.

2. Please note that TF v2.0 support was added in TensorRT v7.0,
and TensorRT v7.0 is not publicly available for Jetson yet.
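
For point 1, a minimal sketch of freezing a Keras .h5 model into a .pb file follows. It assumes TensorFlow 1.15 (graph mode) and a model that `tf.keras` can load; `freeze_keras_h5` and `to_node_name` are hypothetical helper names, and the output path "cnn.pb" in the usage note is only an example. Treat it as a starting point under those assumptions, not as the official conversion procedure.

```python
def to_node_name(tensor_name):
    """Strip the output index, e.g. 'dense_2/Softmax:0' -> 'dense_2/Softmax'."""
    return tensor_name.split(":")[0]


def freeze_keras_h5(h5_path, pb_path):
    """Load a Keras .h5 model and write a frozen GraphDef (.pb). Assumes TF 1.x."""
    # Imported here so to_node_name can be used without TensorFlow installed.
    import tensorflow as tf

    tf.keras.backend.set_learning_phase(0)  # build the graph in inference mode
    model = tf.keras.models.load_model(h5_path)
    sess = tf.compat.v1.keras.backend.get_session()

    # Freezing expects node names, not tensor names with a ":0" suffix.
    output_names = [to_node_name(t.name) for t in model.outputs]
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_names
    )
    with tf.io.gfile.GFile(pb_path, "wb") as f:
        f.write(frozen.SerializeToString())
    return output_names


print(to_node_name("sequential/dense_2/Softmax:0"))  # sequential/dense_2/Softmax
```

A call such as `freeze_keras_h5("cnn.h5", "cnn.pb")` would return the output node names, which are also needed later when the frozen graph is parsed or converted.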

Thanks.

HuiW · March 23, 2020, 9:02am

Hi AastaLLL,

Thank you so much for your support.

So AGX Xavier does not currently support TF v2.0 with TensorRT.
Which TF version would you suggest using on AGX Xavier to get high performance for now?

Thank you,

AastaLLL · March 23, 2020, 9:26am

Hi,

You can try TensorFlow v1.15, which can be installed via our official package directly:
https://developer.download.nvidia.com/compute/redist/jp/v43/tensorflow-gpu/

Thanks

HuiW · March 23, 2020, 11:08am

Hi AastaLLL,

Thank you for your prompt support.
We will try TensorFlow 1.15 first.
By the way, is there a schedule for TensorRT v7.0 support on Jetson?

Thanks,

kayccc · March 23, 2020, 11:28pm

Hi HuiW,

The next JetPack is scheduled to be released in the coming weeks; please wait for our announcement.

Thanks

HuiW · March 31, 2020, 6:30am

Hi,

We got the error "AttributeError: module 'tensorflow_core.contrib' has no attribute 'tensorrt'" when testing NVIDIA-AI-IOT/tf_trt_models (TensorFlow models accelerated with NVIDIA TensorRT) from GitHub.

Which pip wheel of TensorFlow 1.7+ with TensorRT support (step 3 of the setup) should I install?

Or is there a script to optimize a TensorFlow model with TensorRT directly?

I’m trying to test a TensorFlow model on AGX Xavier with TensorFlow v1.15.

Thank you for any advice,

HuiW · April 7, 2020, 9:25am

Hi

If any further information would help with this issue,
please feel free to let me know.

Thank you,

AastaLLL · April 8, 2020, 7:45am

Hi,

Sorry for the late reply.
The GitHub repository you shared uses JetPack 3.2 and TensorFlow 1.8.0.
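
For context on the AttributeError: as far as the TensorFlow release notes indicate, TF-TRT lived under `tensorflow.contrib.tensorrt` in older 1.x releases and was moved to `tensorflow.python.compiler.tensorrt` around TensorFlow 1.14, so scripts written for the older layout can fail on 1.15. Below is a minimal sketch of a version-tolerant lookup. `find_trt_module` is a hypothetical helper, and the candidate order is an assumption.

```python
import importlib

# Newer location first (TF 1.14+), then the contrib location used by older releases.
CANDIDATES = (
    "tensorflow.python.compiler.tensorrt.trt_convert",
    "tensorflow.contrib.tensorrt",
)


def find_trt_module(candidates=CANDIDATES):
    """Return (module_name, module) for the first importable candidate, else (None, None)."""
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except (ImportError, AttributeError):
            continue
    return None, None


name, trt = find_trt_module()
print("TF-TRT module:", name)
```

With the newer module, a frozen graph is typically converted through `trt.TrtGraphConverter(input_graph_def=..., nodes_blacklist=[...])` followed by `.convert()`; the exact arguments should be checked against the installed TensorFlow version.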
For a more recent TensorFlow to TensorRT sample, it’s recommended to use this one:

/usr/src/tensorrt/samples/sampleUffSSD/

Or our tutorial:

Also, could you share what kind of model you use?

It looks like the model is now TensorFlow-based rather than Keras-based.
Could you share it with us?

Thanks.

HuiW · April 9, 2020, 9:17am

Hi AastaLLL,

Thank you for your support.

Running sample_uff_ssd took about one minute to run inference on the SSD model
(with MAXN power mode set and jetson_clocks enabled).
Is this normal? (The log is attached.)
Do we need to optimize?

Does AastaNV/TRT_object_detection work on JetPack 4.3?
Our customer currently has one Xavier set up with JetPack 4.3.

We prefer using a Keras model.
Please find attached a Keras CNN model and a TensorFlow Inception model.

By the way, is it possible to test the Inception model with sample_uff_ssd?
Also, do we need a model built with a TensorFlow version earlier than 2.0?

Thank you,

msg_ok_p.log (15.7 KB)

We failed to upload the model files; only jpg and log files can be uploaded.
Even zipped or renamed files could not be uploaded.

https://drive.google.com/drive/folders/1z_lICNms-eZnJVc6kmpOz57vMs310O9K

AastaLLL · April 10, 2020, 7:14am

Hi,

You can find our TensorRT workflow here:

The first step in sample_uff_ssd is to convert the model into a TensorRT engine.
This step takes time because TensorRT analyzes the network architecture and chooses an optimal implementation for each layer based on the GPU architecture.
This step only needs to be done once. Usually, we serialize the engine so it can be reused next time.
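
The build-once, reuse-later pattern described above can be sketched as follows. This is a minimal illustration in plain Python: `load_or_build_engine` and the cache file name are hypothetical, and `build_fn` stands in for the slow TensorRT build. With the TensorRT Python API the bytes would typically come from `engine.serialize()` and be restored with `Runtime.deserialize_cuda_engine`, which is not shown here.

```python
import os
import tempfile


def load_or_build_engine(cache_path, build_fn):
    """Return serialized engine bytes, calling build_fn only if no cache file exists.

    build_fn is a caller-supplied function returning the serialized engine
    as a bytes-like object (the slow step).
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()
    data = build_fn()
    with open(cache_path, "wb") as f:
        f.write(data)
    return data


# Demonstration with placeholder bytes instead of a real engine.
path = os.path.join(tempfile.mkdtemp(), "model.engine")
first = load_or_build_engine(path, lambda: b"fake-engine")    # builds and caches
second = load_or_build_engine(path, lambda: b"never-called")  # read from the cache
print(first == second)  # True
```

A cached engine is generally only valid for the device and TensorRT version it was built with, so the cache file should be rebuilt after changing either.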

TRT_object_detection can work on JetPack4.3.

You will also need a model from TensorFlow 1.1x. TF v2.0 model support will be available in TensorRT v7.0.

Thanks.

HuiW · April 13, 2020, 11:44am

Hi AastaLLL,

Thank you for your prompt support.

I will upload a TensorFlow 1.1x model once it is ready.

I got an error when running python3 main.py (error message below).
The thing is, I could not find a way to install TensorRT on the AGX:
neither SDK Manager nor manually installing nv-tensorrt-repo-ubuntu1804-cuda10.0-trt6.0.1.5-ga-20190913_1-1_amd64.deb worked.

error message:
nva@nva-desktop:~/TRT_object_detection$ python3 main.py dog-yawning.jpg
2020-04-13 19:31:10.384540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    ctypes.CDLL("lib/libflattenconcat.so")
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvinfer.so.5: cannot open shared object file: No such file or directory
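
The OSError suggests that lib/libflattenconcat.so was linked against TensorRT 5 (libnvinfer.so.5), while the package named above is TensorRT 6.0.1.5. That package is also an amd64 build, whereas the TensorFlow wheel used earlier in the thread is linux_aarch64. Below is a minimal sketch for checking which TensorRT runtime library can actually be loaded on the device. `first_loadable` is a hypothetical helper, and the list of sonames is an assumption for illustration.

```python
import ctypes


def first_loadable(names):
    """Return the first shared-library name that ctypes can load, or None."""
    for name in names:
        try:
            ctypes.CDLL(name)
            return name
        except OSError:
            continue
    return None


# Sonames to probe; the version suffixes are assumptions for illustration.
candidates = ["libnvinfer.so.5", "libnvinfer.so.6", "libnvinfer.so.7"]
print("Loadable TensorRT runtime:", first_loadable(candidates))
```

If only a newer soname loads, rebuilding the plugin library against the installed TensorRT would be the usual remedy, though that is not confirmed in this thread.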

Thank you for any advice,
