create_inference_graph() produces an output model that is too big

I have tried to optimize my custom frozen model to run on TensorRT using create_inference_graph(), but the output was far larger than the original model (my model is around 200MB; after converting it's more than 2GB). When I increased minimum_segment_size to 30 or 40, the size was smaller but still slightly bigger than the original one (probably because not many segments were converted). Is it normal for the converted model to be bigger than the original one?

My code is as below:

# TF 1.x contrib TF-TRT converter
import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=outputs,  # list of output node names
        max_batch_size=64,
        max_workspace_size_bytes=1 << 25,
        precision_mode='FP16',
        minimum_segment_size=10
)

Another thing: because the model was way too big, I couldn't serialize it to a .pb file, and I got this error:

libprotobuf ERROR external/protobuf_archive/src/google/protobuf/message_lite.cc:289] Exceeded maximum protobuf size of 2GB: 2756916500
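For reference, the write-out step that hits this limit is essentially just serializing the converted GraphDef (a minimal sketch; the output file name is a placeholder):

# Minimal sketch of the serialization that fails once the GraphDef passes 2 GB
# (output path is a placeholder).
import tensorflow as tf

with tf.gfile.GFile('trt_model.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())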

Has anyone else run into and solved these issues? Thanks a bunch.

This large-scale file size change is not expected. To help us debug, can you share the full conversion code, the frozen graph, and the serializing code that demonstrates the above error?

Thanks a lot for your reply.

You can download my model here: https://drive.google.com/open?id=1cyMN34fkqNsWQLL5X4drGL0Ar6YJTE11. It includes the TensorFlow checkpoint files and the conversion code. When you run convert.py, you'll get the error about exceeding 2GB.

For the model details, this is the repo I’m using to produce the checkpoint: GitHub - DeepRNN/visual_question_answering: Tensorflow implementation of "Dynamic Memory Networks for Visual and Textual Question Answering"

I'm trying to repro now. It's taking quite a while; still running. Did you see a lot of the following messages?

2019-01-10 20:32:00.986042: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.

Ok, finally finished after over an hour. Will triage and keep you updated.

seeing the error:

[libprotobuf ERROR external/protobuf_archive/src/google/protobuf/message_lite.cc:289] Exceeded maximum protobuf size of 2GB: 2799888904
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Requested return tensor 'result/ArgMax:0' not found in graph def

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "convert.py", line 82, in <module>
    freeze_graph(args.model_dir, args.output_node_names)
  File "convert.py", line 64, in freeze_graph
    return_elements=['result/ArgMax:0'],
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
    raise ValueError(str(e))
ValueError: Requested return tensor 'result/ArgMax:0' not found in graph def

I just modified convert.py a bit: https://gist.github.com/sonnguyen64/200a6796efac6deb78d41aaa80819013. It should work now, I think. Looking at the errors you posted, though, the first line shows that the output graph already exceeded the maximum protobuf file size (2GB), while my model is only around 200MB. It's a bit weird that the size expands so much.
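For context, the freezing step in convert.py roughly does the following (a simplified sketch of the script in the gist; names here are illustrative):

import tensorflow as tf

def freeze_graph(model_dir, output_node_names):
    # Load the latest checkpoint and restore the variables into a fresh graph.
    checkpoint = tf.train.get_checkpoint_state(model_dir).model_checkpoint_path
    with tf.Session(graph=tf.Graph()) as sess:
        saver = tf.train.import_meta_graph(checkpoint + '.meta', clear_devices=True)
        saver.restore(sess, checkpoint)
        # Fold variables into constants so the graph can be saved as a single .pb.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess,
            tf.get_default_graph().as_graph_def(),
            output_node_names.split(','))
    return frozen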

I also had the warning messages you sent, but they didn't seem to cause any errors, so I ignored them.

2019-01-10 20:32:00.986042: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.

Another thing I'd like to ask: since the precision mode I set was FP16, hasn't the model already been compressed by quantization, so its size should be smaller? I tried counting the number of operations in the graph; oddly, the count actually decreased, but the model size is still larger.
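For what it's worth, this is roughly how I counted the ops (a quick sketch; frozen_graph and trt_graph are the GraphDefs from the conversion snippet above):

# Count nodes before/after conversion and how many TensorRT engine ops were created.
n_original = len(frozen_graph.node)
n_converted = len(trt_graph.node)
n_engines = len([n for n in trt_graph.node if n.op == 'TRTEngineOp'])
print('original ops: %d, converted ops: %d, TRTEngineOps: %d'
      % (n_original, n_converted, n_engines))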

When I tried to run inference on the converted model, it actually took longer than the unconverted one.

In case you still get the error “Requested return tensor ‘result/ArgMax:0’ not found in graph def”, try increasing minimum_segment_size to 50. It should also run faster.
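That is, the same conversion call as above, just with the segment threshold raised:

trt_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=outputs,
        max_batch_size=64,
        max_workspace_size_bytes=1 << 25,
        precision_mode='FP16',
        minimum_segment_size=50   # raised from 10
)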

Hello NVES, are there any updates on this problem?

Hello,

We are still reviewing. This seems to be a TensorFlow-TensorRT integration issue. Will keep you updated.

@NVES has there been an update regarding the TensorFlow-TensorRT integration issue?

Any update on this? I'm facing the same problem. In my case, I'm using ssd_mobilenet_v2_coco from the model zoo (https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md) as a pretrained model for training on my own dataset.

The original model is around 19MB in size but the converted model is 37MB. I’m on Jetson Nano with tensorflow-gpu 1.13.1+nv19.4.


Hello,

I see a similar issue with TF 2.0 and TRT.

Do we know if this issue is resolved with previous versions of TF?
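For context, the TF 2.0 conversion path I'm referring to is the SavedModel-based converter, roughly like this (a sketch; the SavedModel paths are placeholders):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Build an FP16 conversion config from the defaults and convert a SavedModel.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='saved_model',       # placeholder input path
    conversion_params=params)
converter.convert()
converter.save('saved_model_trt')              # placeholder output path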

I think this problem of hitting the protobuf limit is fixed by this commit: Merge pull request #30789 from phillip-kravtsov:trt_remove_serialized… · tensorflow/tensorflow@68edd47 · GitHub

Could you rerun your test to verify?
I tried to download the model from the link you posted, but the link seems to be broken.