BERT-based model optimization with TF-TRT on TF 1.15.2 and TensorRT 5.1

Description

I would like to optimize a BERT-based TensorFlow model that I have trained on TF 1.15.2. It’s a base BERT with a logistic-regression layer on top of it.

I am trying to optimize it using TF-TRT to start with, and I have exported the model to a .pb file using TensorFlow.

Environment

TensorRT Version: 5.1.5
GPU Type: V100
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version:
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.15.2
PyTorch Version (if applicable): n/a
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:19.08-py3

Steps To Reproduce

I am running the following commands:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
2020-06-02 18:56:32.762315: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.5
2020-06-02 18:56:32.763013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.5

converter = trt.TrtGraphConverter(input_saved_model_dir="./1588775388")
2020-06-02 18:56:35.956493: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.5
INFO:tensorflow:Linked TensorRT version: (5, 1, 5)
INFO:tensorflow:Loaded TensorRT version: (5, 1, 5)
INFO:tensorflow:Running against TensorRT version 5.1.5

converter.convert()
After a while I get the following:

2020-06-02 18:56:58.223777: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-06-02 18:56:58.223852: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can’t identify the cuda device. Running on device 0
2020-06-02 18:56:58.223959: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.5
2020-06-02 18:56:58.413018: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.5
2020-06-02 18:57:03.623745: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_0 added for segment 0 consisting of 27 nodes succeeded.

and at the end I get this:

2020-06-02 18:57:16.998789: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 45 nodes (0), 48 edges (0), time = 35.247ms.
2020-06-02 18:57:16.998804: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 45 nodes (0), 48 edges (0), time = 3.794ms.
2020-06-02 18:57:16.998815: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 45 nodes (0), 48 edges (0), time = 36.346ms.
[libprotobuf ERROR google/protobuf/io/zero_copy_stream_impl_lite.cc:155] Cannot allocate buffer larger than kint32max for StringOutputStream.
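If I read this right, the last error comes from protobuf rather than TensorRT itself: protobuf cannot serialize a single message larger than kint32max bytes (~2 GiB). A quick sanity check of the numbers (the ~110M parameter count for BERT-base is an approximation):

```python
# protobuf refuses to serialize any single message larger than kint32max bytes.
kint32max = 2**31 - 1
print(kint32max)  # 2147483647, i.e. ~2 GiB

# BERT-base has roughly 110M parameters; stored as fp32 constants the raw
# weights are ~440 MB, comfortably under the cap. The limit is presumably
# only exceeded once TF-TRT embeds serialized engine data and duplicated
# constants into the same GraphDef.
approx_params = 110_000_000
weight_bytes_fp32 = approx_params * 4
print(weight_bytes_fp32)              # 440000000
print(weight_bytes_fp32 < kint32max)  # True
```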

==============

Could you help me figure out the issue?

Also, is there any other avenue I could follow to optimize the model, other than TF-TRT?
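For completeness, here is the conversion written out as a full script. The precision mode, workspace size, and segment size are values I am experimenting with, not known-good settings; is_dynamic_op=True is reportedly a way to avoid serializing TensorRT engines into the GraphDef (which may be relevant to the kint32max error above):

```python
# Sketch of a TF-TRT 1.x conversion (TF 1.15 API). Parameter values are
# guesses to experiment with, not recommendations.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverter(
    input_saved_model_dir="./1588775388",
    precision_mode="FP16",            # or "FP32" / "INT8"
    is_dynamic_op=True,               # build engines at runtime instead of
                                      # embedding serialized engines in the graph
    max_workspace_size_bytes=1 << 30, # 1 GiB scratch space for TensorRT
    minimum_segment_size=3,           # skip tiny subgraphs not worth converting
)
converter.convert()
converter.save("./1588775388_trt")    # writes a new SavedModel
```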

Can you try the solution recommended in the link below:

You can use TensorRT to get better performance.
The workflow will be .pb -> ONNX -> TRT. If any layer is not supported, you will need to create a custom plugin.
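As a rough sketch of that workflow (paths, filenames, and the opset are placeholders; exact trtexec flags vary between TensorRT versions, so check trtexec --help on your install):

```shell
# 1. SavedModel -> ONNX, using tf2onnx
python -m tf2onnx.convert \
    --saved-model ./1588775388 \
    --output bert.onnx \
    --opset 11

# 2. ONNX -> serialized TensorRT engine, using trtexec
trtexec --onnx=bert.onnx --saveEngine=bert.trt --fp16
```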

Thanks

I managed to optimize the model using tf2onnx and then converting from ONNX to TensorRT. Thanks for your help!