Create inference graph failed on Agx Xavier

We would like to run inference with our TensorFlow models, which are built with Keras, on AGX Xavier.
We got an error while converting an .h5 model to an inference graph.

Here is the error message.
Traceback (most recent call last):
  File "trt.py", line 106, in <module>
    print (infer(test, tf_sess, inp, out) )
  File "trt.py", line 88, in infer
    y_batch = sess.run(output_tensor, feed_dict={input_tensor:x_batch})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 28, 28, 1) for Tensor 'conv2d_1/kernel:0', which has shape '(3, 3, 1, 32)'

To use TensorFlow 2.0, we installed JetPack 4.3 and TensorFlow 2.0.

Thank you for any advice,
tf_trt_transfer_fail_msg.txt (15.2 KB)

Hi,

The error indicates that the input size is incorrect.
If this model runs correctly with Keras, this error looks abnormal to me.

Could you share your trt.py so we can debug it?

Thanks.
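
For readers following along: the ValueError above comes from the shape check that session.run applies to each feed_dict entry. The sketch below is a simplified, pure-Python imitation of that check, not TensorFlow's actual code, and the name is_feed_compatible is hypothetical. It illustrates that the (1, 28, 28, 1) batch was fed to a weight tensor, conv2d_1/kernel:0, rather than to an input placeholder such as the image_tensor_x:0 tensor that appears in the graph dump later in this thread.

```python
def is_feed_compatible(value_shape, tensor_shape):
    """Return True if a value of value_shape could be fed to a tensor of tensor_shape.

    None in tensor_shape stands for an unknown dimension (TensorFlow prints it as "?").
    This is a simplified illustration, not TensorFlow's implementation.
    """
    if len(value_shape) != len(tensor_shape):
        return False
    return all(t is None or v == t for v, t in zip(value_shape, tensor_shape))


batch = (1, 28, 28, 1)  # shape of the value fed in trt.py

# conv2d_1/kernel:0 is a 3x3 convolution kernel, so the batch cannot be fed to it.
print(is_feed_compatible(batch, (3, 3, 1, 32)))      # False

# image_tensor_x:0 has an unknown batch dimension and matches the image shape.
print(is_feed_compatible(batch, (None, 28, 28, 1)))  # True
```

If the check fails for the tensor used as the feed key, the likely fix is to select a different input tensor, not to reshape the data.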

Hi AastaLLL,

Thank you for your prompt support.
Please find trt.py attached.

Thank you,
trt.py.txt (3.6 KB)

Hi,

Do you need any further information?
Do you have any suggestions?

Thank you,

Hi,

Sorry for keeping you waiting.
Could you also share the cnn.h5 file so we can check it?

Thanks.

Hi AastaLLL,

Attached please find the files.

Thank you,

Hi,

We dumped the layers of cnn.h5 together with their input and output connections.
Some of the tensors (the first nine listed below) are not connected, and one of them is the node named in the error.

Please first check whether anything goes wrong when converting the Keras network into a TensorFlow graph.

(<tf.Tensor 'conv2d_1/kernel:0' shape=(3, 3, 1, 32) dtype=float32>,)
(<tf.Tensor 'conv2d_1/bias:0' shape=(32,) dtype=float32>,)
(<tf.Tensor 'conv2d_2/kernel:0' shape=(3, 3, 32, 64) dtype=float32>,)
(<tf.Tensor 'conv2d_2/bias:0' shape=(64,) dtype=float32>,)
(<tf.Tensor 'dense_1/kernel:0' shape=(9216, 128) dtype=float32>,)
(<tf.Tensor 'dense_1/bias:0' shape=(128,) dtype=float32>,)
(<tf.Tensor 'dense_2/kernel:0' shape=(128, 10) dtype=float32>,)
(<tf.Tensor 'dense_2/bias:0' shape=(10,) dtype=float32>,)
(<tf.Tensor 'image_tensor_x:0' shape=(?, 28, 28, 1) dtype=float32>,)
(<tf.Tensor 'sequential/flatten_1/Reshape/shape:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'sequential/conv2d_1/Conv2D-0-PermConstNHWCToNCHW-LayoutOptimizer:0' shape=(4,) dtype=int32>,)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool-0-0-PermConstNCHWToNHWC-LayoutOptimizer:0' shape=(4,) dtype=int32>,)
(<tf.Tensor 'sequential/conv2d_1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer:0' shape=(?, 1, 28, 28) dtype=float32>,)
    Input: Tensor("image_tensor_x:0", shape=(?, 28, 28, 1), dtype=float32)
    Input: Tensor("sequential/conv2d_1/Conv2D-0-PermConstNHWCToNCHW-LayoutOptimizer:0", shape=(4,), dtype=int32, device=/job:localhost/replica:0/task:0/device:GPU:0)
(<tf.Tensor 'sequential/conv2d_1/Conv2D:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer:0", shape=(?, 1, 28, 28), dtype=float32, device=/job:localhost/replica:0/task:0/device:GPU:0)
    Input: Tensor("conv2d_1/kernel:0", shape=(3, 3, 1, 32), dtype=float32)
(<tf.Tensor 'sequential/conv2d_1/BiasAdd:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Conv2D:0", shape=(?, 32, 26, 26), dtype=float32)
    Input: Tensor("conv2d_1/bias:0", shape=(32,), dtype=float32)
(<tf.Tensor 'sequential/conv2d_1/Relu:0' shape=(?, 32, 26, 26) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/BiasAdd:0", shape=(?, 32, 26, 26), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/Conv2D:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_1/Relu:0", shape=(?, 32, 26, 26), dtype=float32)
    Input: Tensor("conv2d_2/kernel:0", shape=(3, 3, 32, 64), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/BiasAdd:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/Conv2D:0", shape=(?, 64, 24, 24), dtype=float32)
    Input: Tensor("conv2d_2/bias:0", shape=(64,), dtype=float32)
(<tf.Tensor 'sequential/conv2d_2/Relu:0' shape=(?, 64, 24, 24) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/BiasAdd:0", shape=(?, 64, 24, 24), dtype=float32)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool:0' shape=(?, 64, 12, 12) dtype=float32>,)
    Input: Tensor("sequential/conv2d_2/Relu:0", shape=(?, 64, 24, 24), dtype=float32)
(<tf.Tensor 'sequential/max_pooling2d_1/MaxPool-0-0-TransposeNCHWToNHWC-LayoutOptimizer:0' shape=(?, 12, 12, 64) dtype=float32>,)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool:0", shape=(?, 64, 12, 12), dtype=float32)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool-0-0-PermConstNCHWToNHWC-LayoutOptimizer:0", shape=(4,), dtype=int32, device=/job:localhost/replica:0/task:0/device:GPU:0)
(<tf.Tensor 'sequential/flatten_1/Reshape:0' shape=(?, 9216) dtype=float32>,)
    Input: Tensor("sequential/max_pooling2d_1/MaxPool-0-0-TransposeNCHWToNHWC-LayoutOptimizer:0", shape=(?, 12, 12, 64), dtype=float32, device=/job:localhost/replica:0/task:0/device:GPU:0)
    Input: Tensor("sequential/flatten_1/Reshape/shape:0", shape=(2,), dtype=int32)
(<tf.Tensor 'sequential/dense_1/MatMul:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/flatten_1/Reshape:0", shape=(?, 9216), dtype=float32)
    Input: Tensor("dense_1/kernel:0", shape=(9216, 128), dtype=float32)
(<tf.Tensor 'sequential/dense_1/BiasAdd:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/dense_1/MatMul:0", shape=(?, 128), dtype=float32)
    Input: Tensor("dense_1/bias:0", shape=(128,), dtype=float32)
(<tf.Tensor 'sequential/dense_1/Relu:0' shape=(?, 128) dtype=float32>,)
    Input: Tensor("sequential/dense_1/BiasAdd:0", shape=(?, 128), dtype=float32)
(<tf.Tensor 'sequential/dense_2/MatMul:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_1/Relu:0", shape=(?, 128), dtype=float32)
    Input: Tensor("dense_2/kernel:0", shape=(128, 10), dtype=float32)
(<tf.Tensor 'sequential/dense_2/BiasAdd:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_2/MatMul:0", shape=(?, 10), dtype=float32)
    Input: Tensor("dense_2/bias:0", shape=(10,), dtype=float32)
(<tf.Tensor 'sequential/dense_2/Softmax:0' shape=(?, 10) dtype=float32>,)
    Input: Tensor("sequential/dense_2/BiasAdd:0", shape=(?, 10), dtype=float32)

Thanks.
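
A dump in the format shown above can be produced with a few lines of Python. The following is a minimal sketch, assuming a TensorFlow 1.x tf.Graph whose operations expose get_operations(), op.values() and op.inputs; the helper names dump_graph and source_ops are hypothetical and this is not necessarily the script that was used for the listing.

```python
def dump_graph(graph):
    """Collect one line per op (its output tensors), followed by that op's inputs."""
    lines = []
    for op in graph.get_operations():
        lines.append(str(op.values()))          # tuple of the op's output tensors
        for tensor in op.inputs:
            lines.append("    Input: " + str(tensor))
    return lines


def source_ops(graph):
    """Names of ops without inputs: variables, constants and placeholders."""
    return [op.name for op in graph.get_operations() if len(op.inputs) == 0]
```

With TensorFlow 1.x, the result of dump_graph(tf.get_default_graph()) could be printed line by line. The input placeholder can then be looked up by name, for example graph.get_tensor_by_name("image_tensor_x:0"), instead of relying on the order of tensors in the graph.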

Hi AastaLLL,

Thank you for your support.
Actually, the model we provided comes from a sample on the Keras website.

Could you explain in more detail what you mean by "converting the Keras network into TensorFlow graph first"?
We do not know how to do it.

By the way, do you expect a model built with TensorFlow rather than Keras?

To meet our speed requirements, we set up the AGX Xavier with TensorFlow 2.0.
All we want is to run any TensorFlow model with TensorRT on the AGX.

After that, we can implement our customer's TensorFlow model.
If the model we provided does not work, is there a TensorFlow 2.0 model we can test with?

Thank you for any advice,

Hi AastaLLL,

We failed to create frozen graphs when we tried to build tf_to_trt_image_classification (note: it was built on R28.2).

I have attached a CNN model built with TensorFlow.
Could you check whether the model can be converted to TensorRT?

Or is there a reference for building tf_to_trt_image_classification?
Here are the version I used and the error message.
tensorflow_gpu-2.0.0+nv20.1-cp36-cp36m-linux_aarch64.whl

nva@nva-desktop:~/tf_to_trt_image_classification/build$ make
[ 16%] Building NVCC (Device) object examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o
/usr/include/opencv4/opencv2/stitching/detail/warpers.hpp(213): warning: overloaded virtual function "cv::detail::PlaneWarper::buildMaps" is only partially overridden in class "cv::detail::AffineWarper"

/usr/include/opencv4/opencv2/stitching/detail/warpers.hpp(213): warning: overloaded virtual function "cv::detail::PlaneWarper::warp" is only partially overridden in class "cv::detail::AffineWarper"

/usr/include/opencv4/opencv2/stitching/detail/blenders.hpp(100): warning: overloaded virtual function "cv::detail::Blender::prepare" is only partially overridden in class "cv::detail::FeatherBlender"

/usr/include/opencv4/opencv2/stitching/detail/blenders.hpp(127): warning: overloaded virtual function "cv::detail::Blender::prepare" is only partially overridden in class "cv::detail::MultiBandBlender"

/home/nva/tf_to_trt_image_classification/examples/classify_image/classify_image.cu(95): error: identifier "CV_LOAD_IMAGE_COLOR" is undefined

1 error detected in the compilation of "/tmp/tmpxft_00002c2f_00000000-6_classify_image.cpp1.ii".
CMake Error at classify_image_generated_classify_image.cu.o.cmake:279 (message):
  Error generating file
  /home/nva/tf_to_trt_image_classification/build/examples/classify_image/CMakeFiles/classify_image.dir//./classify_image_generated_classify_image.cu.o


examples/classify_image/CMakeFiles/classify_image.dir/build.make:484: recipe for target 'examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o' failed
make[2]: *** [examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o] Error 1
CMakeFiles/Makefile2:103: recipe for target 'examples/classify_image/CMakeFiles/classify_image.dir/all' failed
make[1]: *** [examples/classify_image/CMakeFiles/classify_image.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
nva@nva-desktop:~/tf_to_trt_image_classification/build$ 

Again, all we want is to run any TensorFlow model with TensorRT on the AGX.
We need to check whether running a TensorFlow model on the AGX performs better than on an Intel x86 platform without a GPU card.
If TensorFlow 2.0 is not a suitable version to use on the AGX, could you recommend a version?

Thank you,
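
A note on the build log above: the only hard error is the undefined identifier CV_LOAD_IMAGE_COLOR. That name belongs to the legacy OpenCV C API, which the default OpenCV 4 headers no longer define; the C++ equivalent is cv::IMREAD_COLOR. The sketch below assumes this is the only incompatibility in the sample (not verified in this thread) and rewrites the legacy constants in a source file. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Legacy OpenCV C-API constants and their C++ replacements.
LEGACY_TO_MODERN = {
    "CV_LOAD_IMAGE_COLOR": "cv::IMREAD_COLOR",
    "CV_LOAD_IMAGE_GRAYSCALE": "cv::IMREAD_GRAYSCALE",
    "CV_LOAD_IMAGE_UNCHANGED": "cv::IMREAD_UNCHANGED",
}

_PATTERN = re.compile(r"\b(" + "|".join(LEGACY_TO_MODERN) + r")\b")


def modernize_source(text):
    """Replace legacy constants in a piece of C++/CUDA source text."""
    return _PATTERN.sub(lambda m: LEGACY_TO_MODERN[m.group(1)], text)


def patch_file(path):
    """Rewrite the file in place; return True if anything changed."""
    source_path = Path(path)
    original = source_path.read_text()
    patched = modernize_source(original)
    if patched != original:
        source_path.write_text(patched)
    return patched != original
```

Calling patch_file("examples/classify_image/classify_image.cu") from the repository root and rerunning make would be the intended use. Since the sample targets R28.2, other incompatibilities may still surface afterwards.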

Hi,

Sorry for keeping you waiting.

1. The input of the TensorRT parser is a frozen .pb file.
Please convert the model into .pb format, from either TensorFlow or Keras.

2. Please note that TF v2.0 support was added in TensorRT v7.0,
and TensorRT v7.0 is not yet publicly available for Jetson.

Thanks.
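
Step 1 can be sketched as follows for a Keras model. This is a minimal sketch assuming TensorFlow 1.x (for example 1.15) and a model that tf.keras can load; a file saved with standalone Keras may need the standalone keras package instead. The function names are hypothetical, and TensorFlow is imported inside the function so that the small name helper can be used without it.

```python
def output_node_names(tensor_names):
    """Strip the output index: "dense_2/Softmax:0" -> "dense_2/Softmax"."""
    return [name.rsplit(":", 1)[0] for name in tensor_names]


def freeze_keras_model(h5_path, pb_dir, pb_name):
    """Load a Keras .h5 model and write a frozen GraphDef (.pb). Assumes TF 1.x."""
    import tensorflow as tf  # lazy import; TF 1.x APIs are used below

    tf.keras.backend.set_learning_phase(0)        # build the graph in inference mode
    model = tf.keras.models.load_model(h5_path)
    sess = tf.keras.backend.get_session()
    names = output_node_names([t.name for t in model.outputs])
    # Replace variables by constants so the graph is self-contained.
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), names)
    tf.io.write_graph(frozen, pb_dir, pb_name, as_text=False)
    return names
```

For the model discussed here, a call such as freeze_keras_model("cnn.h5", ".", "cnn.pb") would be the intended use; the returned node names are the ones a later conversion step needs as outputs.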

Hi AastaLLL,

Thank you so much for your support.

So AGX Xavier does not currently support TF v2.0 with TensorRT.
Which TF version would you suggest using on AGX Xavier to get high performance for now?

Thank you,

Hi,

You can try TensorFlow v1.15, which can be installed via our official package directly:
https://developer.download.nvidia.com/compute/redist/jp/v43/tensorflow-gpu/

Thanks

Hi AastaLLL,

Thank you for your prompt support.
We will try TensorFlow 1.15 first.
By the way, is there a schedule for TensorRT v7.0 support on Jetson?

Thanks,

Hi HuiW,

The next JetPack is scheduled to be released in the coming weeks; please wait for our announcement.

Thanks

Hi,

We got the error "AttributeError: module 'tensorflow_core.contrib' has no attribute 'tensorrt'" when testing https://github.com/NVIDIA-AI-IOT/tf_trt_models.

Which pip wheel of TensorFlow 1.7+ (with TensorRT support) (step 3 of the setup) should I install?

Or is there a script to optimize a TensorFlow model with TensorRT directly?

I'm trying to test a TensorFlow model on AGX Xavier with TensorFlow v1.15.

Thank you for any advice,
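
Some background on the AttributeError: older TensorFlow 1.x releases exposed TF-TRT as tensorflow.contrib.tensorrt, while releases from around 1.14 onward provide it as tensorflow.python.compiler.tensorrt.trt_convert, so code written for the old location fails on 1.15. The sketch below probes both locations; the function name and the candidates parameter are hypothetical, and the version boundaries in the comments are approximate.

```python
import importlib

# Module paths where TF-TRT has lived, newest first.
CANDIDATES = (
    "tensorflow.python.compiler.tensorrt.trt_convert",  # roughly TF 1.14 and later
    "tensorflow.contrib.tensorrt",                      # older TF 1.x releases
)


def find_tf_trt(candidates=CANDIDATES):
    """Return (module_path, module) for the first importable candidate, else (None, None)."""
    for path in candidates:
        try:
            return path, importlib.import_module(path)
        except ImportError:
            continue
    return None, None
```

A script could call path, trt = find_tf_trt() and print path to see which location the installed wheel provides. If it reports the trt_convert module, scripts that reference tf.contrib.tensorrt need their import changed accordingly.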

Hi

If any further information would help with this issue,
please feel free to let me know.

Thank you,

Hi,

Sorry for the late reply.
The GitHub repository you shared uses JetPack 3.2 and TensorFlow 1.8.0.
For a more recent TensorFlow-to-TensorRT sample, we recommend this one:

/usr/src/tensorrt/samples/sampleUffSSD/

Or our tutorial:

Also, could you tell us what kind of model you are using?

It looks like the model is now TensorFlow-based rather than Keras-based.
Could you share it with us?

Thanks.

Hi AastaLLL,

Thank you for your support.

Running sample_uff_ssd took about one minute to run inference on the SSD model
(with the MAXN power mode set and jetson_clocks enabled).
Is that normal? (The log is attached.)
Do we need to optimize anything?

Does AastaNV/TRT_object_detection work on JetPack 4.3?
We ask because our customer currently has a Xavier set up with JetPack 4.3.

We prefer to use a Keras model.
Please find attached a Keras CNN model and a TensorFlow Inception model.

By the way, is it possible to test the Inception model with sample_uff_ssd?
Also, do we need a model built with a TensorFlow version earlier than 2.0?

Thank you,

msg_ok_p.log (15.7 KB)

I failed to upload the model files; only jpg and log files can be uploaded.
Even zipping or renaming the files did not work.

https://drive.google.com/drive/folders/1z_lICNms-eZnJVc6kmpOz57vMs310O9K

Hi,

You can find our TensorRT workflow here:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#overview

The first step in sample_uff_ssd is to convert the model into a TensorRT engine.
This step takes time because TensorRT analyzes the network architecture and chooses an optimal implementation for each layer based on the GPU architecture.
This step only needs to be done once. Usually, we serialize the engine for later use.

TRT_object_detection can work on JetPack 4.3.

And you will need a model from TensorFlow 1.1x. TF v2.0 model support will be available in TensorRT v7.0.

Thanks.
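
The build-once, reuse-later pattern described above can be sketched independently of TensorRT. In the sketch below, load_or_build and its parameters are hypothetical names; the demo uses a string as a stand-in engine. With the TensorRT Python API, the assumption is that serialize would wrap engine.serialize() and deserialize would wrap trt.Runtime(logger).deserialize_cuda_engine(data).

```python
import os
import tempfile


def load_or_build(cache_path, build, serialize, deserialize):
    """Return an engine, running the slow build step only if no cached copy exists."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return deserialize(f.read())
    engine = build()                       # slow optimization step, done once
    with open(cache_path, "wb") as f:
        f.write(serialize(engine))         # keep the result for the next run
    return engine


# Demo with a stand-in "engine" (a plain string) instead of a real TensorRT engine.
build_calls = []

def fake_build():
    build_calls.append(1)
    return "optimized-engine"

cache_file = os.path.join(tempfile.mkdtemp(), "model.engine")
first = load_or_build(cache_file, fake_build, str.encode, bytes.decode)
second = load_or_build(cache_file, fake_build, str.encode, bytes.decode)
print(first == second, len(build_calls))   # True 1
```

A serialized engine is generally tied to the TensorRT version and GPU it was built on, so the cache file should be rebuilt after upgrading JetPack or moving to a different device.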

Hi AastaLLL,

Thank you for your prompt support.

I will upload the TensorFlow 1.1x model once it is ready.

I got an error when running python3 main.py (error message below).
The problem is that I could not find a way to install TensorRT on the AGX,
neither with SDK Manager nor by manually installing nv-tensorrt-repo-ubuntu1804-cuda10.0-trt6.0.1.5-ga-20190913_1-1_amd64.deb.

error message:
nva@nva-desktop:~/TRT_object_detection$ python3 main.py dog-yawning.jpg
2020-04-13 19:31:10.384540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    ctypes.CDLL("lib/libflattenconcat.so")
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvinfer.so.5: cannot open shared object file: No such file or directory

Thank you for any advice,
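
Two observations on the message above, offered as a sketch for diagnosis, not a confirmed fix. First, the OSError says that lib/libflattenconcat.so was linked against libnvinfer.so.5 (TensorRT 5), so a system with a different TensorRT major version installed cannot load it until the plugin is rebuilt. Second, the .deb file name ends in _amd64, which is an x86_64 package, while the AGX Xavier is an aarch64 machine. The helper names and the default library directory below are assumptions.

```python
import glob
import os
import re


def nvinfer_versions(filenames):
    """Extract major versions from names like libnvinfer.so.6 or libnvinfer.so.6.0.1."""
    versions = set()
    for name in filenames:
        match = re.fullmatch(r"libnvinfer\.so\.(\d+)(?:\.\d+)*", os.path.basename(name))
        if match:
            versions.add(int(match.group(1)))
    return sorted(versions)


def installed_nvinfer_versions(lib_dir="/usr/lib/aarch64-linux-gnu"):
    """List the libnvinfer major versions found in lib_dir (assumed Jetson location)."""
    return nvinfer_versions(glob.glob(os.path.join(lib_dir, "libnvinfer.so.*")))


def deb_matches_machine(deb_name, machine):
    """Debian package names end in _<arch>.deb; amd64 means x86_64, arm64 means aarch64."""
    arch = deb_name.rsplit("_", 1)[-1].replace(".deb", "")
    return {"amd64": "x86_64", "arm64": "aarch64"}.get(arch) == machine


deb = "nv-tensorrt-repo-ubuntu1804-cuda10.0-trt6.0.1.5-ga-20190913_1-1_amd64.deb"
print(deb_matches_machine(deb, "aarch64"))  # False
```

If installed_nvinfer_versions() reports a version other than 5 on the device, rebuilding libflattenconcat.so against the installed TensorRT would be the next thing to try. On Jetson, TensorRT is normally installed together with JetPack, not from a standalone x86 repository package.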