incompatible with expected resource

cyz0202 · April 12, 2019, 2:42am

Hi, I am using the Transformer model in OpenSeq2Seq to run the built-in en-de machine translation example.

I followed the OpenSeq2Seq tutorial, except that I used the Transformer model instead of nmt.

I can train the model and run inference with TensorFlow 1.13.1 and CUDA 10.1, but I can’t run inference through the TensorRT 5 integration in TensorFlow 1.13.1.

The main error is:

ValueError: Input 0 of node ForwardPass/transformer_decoder/decode/while/layer_0/self_attention/self_attention/q/Tensordot/ReadVariableOp/Enter was passed float from ForwardPass/transformer_decoder/layer_0/self_attention/self_attention/q/kernel:0 incompatible with expected resource.

The full error log is:

Traceback (most recent call last):
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/run.py", line 101, in <module>
    main()
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/run.py", line 81, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/utils/utils.py", line 790, in create_model
    model.compile(checkpoint=checkpoint)
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/models/model.py", line 445, in compile
    checkpoint=checkpoint
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/models/model.py", line 645, in build_trt_forward_pass_graph
    maximum_cached_engines=trt_params["trt_maximum_cached_engines"]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/tensorrt/python/trt_convert.py", line 333, in create_inference_graph
    importer.import_graph_def(input_graph_def, name="")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 430, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 0 of node ForwardPass/transformer_decoder/decode/while/layer_0/self_attention/self_attention/q/Tensordot/ReadVariableOp/Enter was passed float from ForwardPass/transformer_decoder/layer_0/self_attention/self_attention/q/kernel:0 incompatible with expected resource.

Process finished with exit code 1
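
For readers triaging this error: the message names the node that rejected its input, the tensor that fed it, and the two dtypes involved. Here the node appears to expect a resource handle (a resource variable) but receives a plain float tensor. A minimal sketch of pulling those fields out of the message follows; the helper name `parse_dtype_mismatch` is hypothetical, and the pattern assumes the wording shown in the log above, which other TensorFlow versions may not share.

```python
import re

# Pattern for the message wording seen in the log above (assumption:
# other TensorFlow versions may phrase this differently).
_MISMATCH = re.compile(
    r"Input (?P<index>\d+) of node (?P<node>\S+) was passed (?P<got>\w+) "
    r"from (?P<producer>\S+) incompatible with expected (?P<expected>\w+)"
)


def parse_dtype_mismatch(message):
    """Return the fields of a dtype-mismatch message as a dict, or None."""
    match = _MISMATCH.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["index"] = int(fields["index"])
    return fields


message = (
    "Input 0 of node ForwardPass/transformer_decoder/decode/while/layer_0/"
    "self_attention/self_attention/q/Tensordot/ReadVariableOp/Enter was "
    "passed float from ForwardPass/transformer_decoder/layer_0/"
    "self_attention/self_attention/q/kernel:0 incompatible with expected "
    "resource."
)
info = parse_dtype_mismatch(message)
print("consumer:", info["node"])
print("producer:", info["producer"])
print("dtype   :", info["got"], "passed,", info["expected"], "expected")
```

The consumer name ends in `while/.../ReadVariableOp/Enter`, which points at a variable read inside a while loop rather than at any of the training-only ops.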

ATTEMPTED FIX: I found a suggestion online to change some node ops (RefSwitch, AssignSub and AssignAdd to Switch, Sub and Add, respectively), but it did not help. I applied the modification as follows:

        # Restore checkpoint here because we have to freeze the graph
        tf_saver = tf.train.Saver()
        tf_saver.restore(save_path=checkpoint, sess=tf_sess)

        # I also tried applying the modification here, but it did not help.

        frozen_graph = tf.graph_util.convert_variables_to_constants(
            tf_sess,
            tf_sess.graph_def,
            output_node_names=output_node_names
        )
        num_nodes = len(frozen_graph.node)
        print('Converting graph using TensorFlow-TensorRT...')

        # THIS IS THE MODIFICATION 
        # gd = tf_sess.graph.as_graph_def()
        for node in frozen_graph.node:
            if node.op == 'RefSwitch':
                node.op = 'Switch'
                for index in range(len(node.input)):
                    if 'moving_' in node.input[index]:
                        node.input[index] = node.input[index] + '/read'
            elif node.op == 'AssignSub':
                node.op = 'Sub'
                if 'use_locking' in node.attr: del node.attr['use_locking']
            elif node.op == 'AssignAdd':
                node.op = 'Add'
                if 'use_locking' in node.attr: del node.attr['use_locking']
        
        # ERROR OCCURS IN THE FOLLOWING FUNCTION.
        frozen_graph = trt.create_inference_graph(
            input_graph_def=frozen_graph,
            outputs=output_node_names,
            max_batch_size=trt_params["batch_size_per_gpu"],
            max_workspace_size_bytes=trt_params["trt_max_workspace_size_bytes"],
            precision_mode=trt_params["trt_precision_mode"],
            minimum_segment_size=trt_params["trt_minimum_segment_size"],
            is_dynamic_op=trt_params["trt_is_dynamic_op"],
            maximum_cached_engines=trt_params["trt_maximum_cached_engines"]
        )
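
One possible reason the rewrite above has no effect is that it only touches RefSwitch, AssignSub and AssignAdd nodes, while the node named in the error ends in `ReadVariableOp/Enter` and so, judging by its name, has a different op. The sketch below checks this before converting. It is duck-typed, so it should accept a real GraphDef (anything with a `.node` list whose items have `.name` and `.op`), but the toy graph and both helper names are hypothetical stand-ins, not part of OpenSeq2Seq or TensorFlow.

```python
from collections import Counter
from types import SimpleNamespace

# Ops that the rewrite above changes.
REWRITE_TARGETS = ("RefSwitch", "AssignSub", "AssignAdd")


def op_histogram(graph_def):
    """Count nodes per op type for anything exposing a `.node` list."""
    return Counter(node.op for node in graph_def.node)


def rewrite_would_touch(graph_def, failing_node_name):
    """Return True if the named node has an op that the rewrite changes."""
    for node in graph_def.node:
        if node.name == failing_node_name:
            return node.op in REWRITE_TARGETS
    raise KeyError(failing_node_name)


# Hypothetical stand-in for a frozen graph; the real one would be the
# `frozen_graph` object from the snippet above.
toy_graph = SimpleNamespace(node=[
    SimpleNamespace(name="layer_0/q/kernel", op="Const"),
    SimpleNamespace(name="decode/while/q/ReadVariableOp/Enter", op="Enter"),
    SimpleNamespace(name="decode/while/q/ReadVariableOp", op="ReadVariableOp"),
])

counts = op_histogram(toy_graph)
print({op: counts[op] for op in REWRITE_TARGETS})
print(rewrite_would_touch(toy_graph, "decode/while/q/ReadVariableOp/Enter"))
```

If none of the target ops appear in the histogram of the real frozen graph, the rewrite loop never changes anything, which would match the behaviour described above.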

I will be very grateful for any help.

cyz0202 · April 18, 2019, 12:42pm

any help…

tmorris · April 25, 2019, 9:16pm

Please try the fix suggested here: incompatible with expected resource · Issue #407 · NVIDIA/OpenSeq2Seq · GitHub

Tejaswini · June 28, 2019, 1:58pm

I had a similar issue. I was getting the following error:

ValueError: Input 0 of node mobilenetv2_1.00_224/bn_Conv1/cond/ReadVariableOp/Switch was passed float from bn_Conv1/gamma:0 incompatible with expected resource

Following https://stackoverflow.com/a/52823701, I called the following before loading the model when converting my Keras model to a TensorFlow graph:

tf.keras.backend.set_learning_phase(0)

This solved my error.
Hope this helps someone with similar issues!

Cheers,
T

ixtiyoruz312 · July 8, 2019, 3:37am

Tejaswini’s suggestion above (calling tf.keras.backend.set_learning_phase(0) before loading the model) worked for me as well. Thanks!

alonzo.magaly · July 29, 2019, 10:29am

Hi,

Sorry for coming back to this, but I am running into the same problem and the proposed workarounds do not work in my case.
Here is the context:
I am creating an LSTM using Keras and training it.
I’d like to run inference on a Jetson Xavier (not in my possession).
I’m testing the inference engine creator on a Linux 16.04 machine, so that I can send it to the person who will convert the model for inference on the Jetson.
The TensorFlow version is 1.13.1.
TensorRT is installed along with the GPU build of TensorFlow.

The model architecture is the following:

model = Sequential()
model.add(LSTM(MODEL['model_layers'][0], input_shape=(input_size), dropout=0))
model.add(Dropout(MODEL['dropout']))
model.add(Dense(MODEL['dense_layers'][0], activation=MODEL['model_activation']))
model.add(Dense(len(g.LABEL_LIST), activation=MODEL['output_activation'], name='out'))

I tried using the HDF5 and JSON files directly, and also loading them, freezing them to .pb format and loading the result again.

Here is the code I am using to load and convert the model:

def meta_to_trt(frozen_graph_path, bs, p_mode, seg_size, ws_size):
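    """
    Create a TF-TRT graph from a saved checkpoint.

    frozen_graph_path (str): path to a frozen model in .pb format obtained with freeze_**_graph
        (not used below; the graph is restored from the "saved_ckpt-0" checkpoint instead).
    bs (int): maximum batch size
    p_mode (str): precision mode passed to create_inference_graph
    seg_size (int): segment size (not used below)
    ws_size (list): [value, shift]; the workspace size in bytes is value << shift
    """
    # Restore the MetaGraphDef and checkpoint, freeze the graph, then convert with TF-TRT: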
    output_nodes = ['out/Softmax']
    graph = tf.Graph()
    with graph.as_default():
        tf.keras.backend.set_learning_phase(0)
        with tf.Session() as sess:
            # First create a `Saver` object (for saving and rebuilding a
            # model) and import your `MetaGraphDef` protocol buffer into it:
            saver = tf.train.import_meta_graph("saved_ckpt-0.meta")
            # Then restore your training data from checkpoint files:
            saver.restore(sess, "saved_ckpt-0")
            # Finally, freeze the graph:
            frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
                sess,
                tf.get_default_graph().as_graph_def(),
                output_node_names=output_nodes)
            # Now you can create a TensorRT inference graph from your
            # frozen graph:
            trt_graph = trt.create_inference_graph(
                input_graph_def=frozen_graph,
                outputs=output_nodes,
                is_dynamic_op=False,
                max_batch_size=bs,
                max_workspace_size_bytes=ws_size[0]<<ws_size[1],
                precision_mode=p_mode)

As you can see this is very close to what is proposed in the documentation.
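
As a side note on the `max_workspace_size_bytes=ws_size[0]<<ws_size[1]` argument in the function above: the left shift multiplies the first element by two to the power of the second. The sketch below works through the arithmetic; the helper name and the example values `[1, 30]` and `[512, 20]` are hypothetical and not taken from the original post.

```python
def workspace_bytes(ws_size):
    """Compute ws_size[0] << ws_size[1], i.e. ws_size[0] * 2 ** ws_size[1]."""
    value, shift = ws_size
    return value << shift


one_gib = workspace_bytes([1, 30])     # 1 * 2**30 bytes
half_gib = workspace_bytes([512, 20])  # 512 * 2**20 bytes

print(one_gib)   # 1073741824
print(half_gib)  # 536870912
```

Writing the size this way keeps the unit visible: a shift of 20 counts in MiB and a shift of 30 counts in GiB.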
The model outputs the following:

/…/anaconda3/envs/sensor_har/lib/python3.6/site-packages/tensorflow/contrib/tensorrt/__init__.py
INFO:tensorflow: Restoring parameters from saved_ckpt-0
INFO:tensorflow: Froze 7 variables.
INFO:tensorflow: Converted 7 variables to const ops.
INFO:tensorflow: Running against TensorRT version 1164267952.32754.1164267952

ValueError: Input 0 of node lstm/while/ReadVariableOp/Enter was passed float from lstm/kernel:0 incompatible with expected resource.

Any advice is welcome :)
Cheers

M

ta2184 · August 1, 2019, 9:37pm

In addition to setting tf.keras.backend.set_learning_phase(0) before loading the model, I converted my graph using the

saved_model_builder.SavedModelBuilder()

as shown here: https://www.tensorflow.org/api_docs/python/tf/saved_model/Builder

Let me know if it works!

Cheers!
T

abernathi · August 2, 2019, 4:06pm

Having the exact same issue as alonzo.magaly. I have a successfully frozen graph with LSTMs but when I try passing it into the converter and then calling convert, it spits out the exact same error he’s having.

I also tried ta2184’s suggestion above (setting tf.keras.backend.set_learning_phase(0) before loading the model and saving with saved_model_builder.SavedModelBuilder(), as shown at https://www.tensorflow.org/api_docs/python/tf/saved_model/Builder), but I’m still getting errors:

builder = tf.saved_model.builder.SavedModelBuilder('graph_save-1')
with tf.Session() as sess:
    ...
    builder.add_meta_graph_and_variables(sess, ["mytag"], clear_devices=True)

builder.save()

This creates a graph_save-1 folder containing a saved_model.pb file and a variables subfolder. I then start another Python session for TensorRT and point it to the graph_save-1 folder like so:

converter = trt.TrtGraphConverter(input_saved_model_dir="graph_save-1", input_saved_model_tags=['mytag'], nodes_blacklist=['outputnode'])

But I get the error “Failed to import metagraph, see error log for details”. I don’t know where that error log is, since no new files are created. Any other ideas? I did call tf.keras.backend.set_learning_phase(0) before loading the model.

ta2184 · August 2, 2019, 4:19pm

How are you creating the TensorRT inference graph?
Using trt.create_inference_graph and passing the path of the folder containing the saved model .pb to the input_saved_model_dir argument worked for me.

abernathi · August 2, 2019, 6:08pm

I was just using the converter.convert() function, but trying to use trt.create_inference_graph still results in the same error:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
trt_graph = trt.create_inference_graph(None, outputs=['output'], input_saved_model_dir="graph_save-1", input_saved_model_tags=["mytag"])

When I try running this, it gives me the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Failed to import metagraph, check error log for more info.

I found that this was because it was unable to find the output node, and so I verified that my graph is saving it by printing out a list of all the nodes in the graph. Sure enough, it’s there but tensorrt just can’t find it for some reason:

name: "output"
op: "Reshape"
input: "dense/BiasAdd"
input: "output/shape"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "Tshape"
  value {
    type: DT_INT32
  }
}

I also tried passing in a frozen_graph to trt.create_inference_graph, but it just gives me the older error “ValueError: Input 0 of node … incompatible with expected resource.”

Any other ideas?

ta2184 · August 2, 2019, 6:36pm

Usually the output node name should end with :0, so if “output” is the name I would write “import/output:0”. Maybe it is different for me. You can try this, or you can use TensorBoard to view your graph: it shows the exact output node name, and you need to use that exact name.
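
For context on the naming: in a TensorFlow 1.x graph, a node (operation) is named like `output`, while a tensor it produces is named like `output:0`, that is, the node name plus an output index. The `import/` prefix typically appears when a GraphDef is imported with the default `name` of tf.import_graph_def. A minimal sketch of converting between the two forms follows; the helper name is hypothetical, and whether a given API wants the node form or the tensor form is something to check for that API.

```python
def split_tensor_name(name):
    """Split 'scope/op:1' into ('scope/op', 1); a bare op name maps to index 0."""
    node, sep, index = name.rpartition(":")
    if sep and index.isdigit():
        return node, int(index)
    return name, 0


print(split_tensor_name("output:0"))         # ('output', 0)
print(split_tensor_name("import/output:0"))  # ('import/output', 0)
print(split_tensor_name("out/Softmax"))      # ('out/Softmax', 0)
```

If a function reports that it cannot find `output:0`, passing the node part returned by this helper is one thing to try.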

abernathi · August 7, 2019, 2:48pm

I tried adding the :0 to the output nodes but then it said it couldn’t find that node so I don’t think this library requires you to append that to the end of the op to get the tensor.

After some more looking around (and plenty of debugging), I think the issue is that TF-TRT errors out on ops that it can’t support, rather than skipping them and optimizing only the ones it does support, as I had assumed. It seems to dislike the while loop op. I suspect the RNN ops that TensorRT supports require the RNN to be unrolled, to get rid of any loops and flatten the network. Unfortunately, the network that I’m testing has an additional while_loop inside a custom Keras cell that I can’t replace with any other TensorRT-supported op. Thanks for the help, though!
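
One way to check the while-loop suspicion on a given graph is to list the nodes whose ops belong to TensorFlow 1.x loop control flow (Enter, Exit, NextIteration and LoopCond). The sketch below is duck-typed, so it should accept a real GraphDef, but the toy graph is a hypothetical stand-in that only borrows node names from the errors in this thread, and the helper name is not part of any library.

```python
from types import SimpleNamespace

# Primitive ops that TensorFlow 1.x emits for tf.while_loop.
LOOP_OPS = frozenset({"Enter", "Exit", "NextIteration", "LoopCond"})


def loop_nodes(graph_def):
    """Return the names of nodes that belong to while-loop control flow."""
    return [node.name for node in graph_def.node if node.op in LOOP_OPS]


# Hypothetical stand-in for a frozen LSTM graph.
toy_graph = SimpleNamespace(node=[
    SimpleNamespace(name="lstm/kernel", op="Const"),
    SimpleNamespace(name="lstm/while/ReadVariableOp/Enter", op="Enter"),
    SimpleNamespace(name="lstm/while/ReadVariableOp", op="ReadVariableOp"),
    SimpleNamespace(name="lstm/while/Exit", op="Exit"),
    SimpleNamespace(name="out/Softmax", op="Softmax"),
])

found = loop_nodes(toy_graph)
print(found)
```

An empty result would suggest the graph has no while loops left, for example because the RNN was unrolled; a non-empty one shows which scopes still contain them.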

cerrie · August 16, 2019, 2:25am

Tejaswini’s suggestion above (calling tf.keras.backend.set_learning_phase(0) before loading the model) worked for me as well. Thanks very much!

Tejaswini · August 16, 2019, 12:52pm

Awesome! Glad it helped!

Cheers,
T
