ValueError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.

Platform: NVIDIA Jetson Nano
Jetpack 4.3
Python: 3.6.9
Tensorflow-GPU: 1.15.0+nv20.1.tf1
Keras: 2.3.1
TensorRT: 6.0.1.10

I have been stuck on this issue for the last couple of days. Below is my TensorFlow training code, which trains a network on the MNIST dataset and then serializes it both as JSON + H5 and in the TF SavedModel format.


from keras.models import model_from_json
from keras import backend as K
from tensorflow.python.platform import gfile
from tensorflow.python.compiler.tensorrt import trt_convert as trt
import tensorflow as tf
import numpy
import time
import os

# Downloading the MNIST Dataset:

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

import matplotlib.pyplot as plt

image_index = 7777 # You may select anything up to 60,000
print(y_train[image_index]) # The label is 8

x_train.shape

# Reshaping the array to 4-dims so that it can work with the Keras API

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)

# Making sure that the values are float so that we can get decimal points after division

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# Normalizing the grayscale pixel values to [0, 1] by dividing by the max value (255).

x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print('Number of images in x_train', x_train.shape[0])
print('Number of images in x_test', x_test.shape[0])

tf.keras.backend.set_learning_phase(0)

# Creating a Sequential Model and adding the layers

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(28, kernel_size=(3,3), input_shape=[28,28, 1]))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
#model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10,activation=tf.nn.softmax))

# Compiling Model

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Printing Model Summary:

model.summary()

# Training Model:

print('Proceeding for Training ... !')
model.fit(x=x_train,y=y_train, epochs=5)
print("MNIST Model Training Complete !!!")

print("Serializing Model to JSON file along with HDF5 Weights file !!!")

# Serialize model to JSON

model_json = model.to_json()
with open("mnist_model.json", "w") as json_file:
    json_file.write(model_json)

# Serialize weights to HDF5

model.save_weights("mnist_weights.h5")
model.save('./tftrt_model/mnist_completemodel', save_format='tf')
print("[INFO] >>> Saved model to Disk in all formats !")

print("Proceeding for model Evaluation ... !")
modelEval = model.evaluate(x_test, y_test)
print("MNIST Model Evaluation Complete !!!")
print(modelEval)

Now when I use the following script in a separate file:


from keras.models import model_from_json
from keras import backend as K
from tensorflow.python.platform import gfile
from tensorflow.python.compiler.tensorrt import trt_convert as trt
import tensorflow as tf
import numpy
import time
import os

# Loading SavedModel format from Disk:

print('[INFO] >>> Loading MNIST SavedModel from Disk !')
model_directory_path = './tftrt_model/mnist_completemodel'

print('[INFO] >>> Proceeding for TF-TRT Graph Conversion/Optimization of TF Graph in Frozen mode ... ')
converter = trt.TrtGraphConverter(input_saved_model_dir=model_directory_path, input_saved_model_signature_key='serving_default')
converter.convert()
print('[INFO] >>> TF Graph converted to TF-TRT Graph. Proceeding to save in specified directory !')
save_directory = './tftrt_model/mnist_tftrt'
converter.save(save_directory)
print('[INFO] >>> TF-TRT Graph saved successfully in specified directory !: ', save_directory)

and run it to load the TF SavedModel and optimize it with TF-TRT, it fails with the following error:


2020-02-25 14:02:05.033077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 125 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/compiler/tensorrt/trt_convert.py:494: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/compiler/tensorrt/trt_convert.py:517: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/graph_util_impl.py:277: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2020-02-25 14:02:07.898188: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.898352: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2020-02-25 14:02:07.903812: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-02-25 14:02:07.904952: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.905097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-02-25 14:02:07.905191: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-25 14:02:07.905254: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-25 14:02:07.905313: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-02-25 14:02:07.905361: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-02-25 14:02:07.905408: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-02-25 14:02:07.905454: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-02-25 14:02:07.905498: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-25 14:02:07.905663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.905859: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.905937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-02-25 14:02:07.906005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-25 14:02:07.906040: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-02-25 14:02:07.906068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-02-25 14:02:07.906244: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.906458: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-02-25 14:02:07.906565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 125 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-02-25 14:02:08.476106: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:586] TensorRTOptimizer failed: Invalid argument: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.
2020-02-25 14:02:08.501164: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must NOT be called on function objects.
2020-02-25 14:02:08.523045: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must NOT be called on function objects.
2020-02-25 14:02:08.533388: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:839] Optimization results for grappler item: tf_graph
2020-02-25 14:02:08.533465: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 9 nodes (0), 7 edges (0), time = 14.036ms.
2020-02-25 14:02:08.533501: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] layout: Graph size after: 9 nodes (0), 7 edges (0), time = 7.769ms.
2020-02-25 14:02:08.533535: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 9 nodes (0), 7 edges (0), time = 7.076ms.
2020-02-25 14:02:08.533570: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] TensorRTOptimizer: Invalid argument: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.
2020-02-25 14:02:08.533601: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 9 nodes (0), 7 edges (0), time = 8.788ms.
2020-02-25 14:02:08.533630: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:839] Optimization results for grappler item: __inference__wrapped_model_336
2020-02-25 14:02:08.533658: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 26 nodes (0), 31 edges (0), time = 3.215ms.
2020-02-25 14:02:08.533687: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] layout: Graph size after: 30 nodes (4), 35 edges (4), time = 4.577ms.
2020-02-25 14:02:08.533717: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 30 nodes (0), 35 edges (0), time = 3.342ms.
2020-02-25 14:02:08.533746: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] TensorRTOptimizer: Graph size after: 30 nodes (0), 35 edges (0), time = 0.325ms.
2020-02-25 14:02:08.533775: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 30 nodes (0), 35 edges (0), time = 3.373ms.
2020-02-25 14:02:08.533803: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:839] Optimization results for grappler item: __inference_signature_wrapper_563
2020-02-25 14:02:08.533830: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 10 nodes (0), 10 edges (0), time = 1.537ms.
2020-02-25 14:02:08.533859: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] layout: Graph size after: 10 nodes (0), 10 edges (0), time = 1.302ms.
2020-02-25 14:02:08.533890: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 10 nodes (0), 10 edges (0), time = 1.607ms.
2020-02-25 14:02:08.533920: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] TensorRTOptimizer: Graph size after: 10 nodes (0), 10 edges (0), time = 0.344ms.
2020-02-25 14:02:08.533950: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] constant_folding: Graph size after: 10 nodes (0), 10 edges (0), time = 2.343ms.
[INFO] >>> TF Graph converted to TF-TRT Graph. Proceeding to save in specified directory !
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/importer.py", line 501, in _import_graph_def_internal
    graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tftrt_mnist.py", line 44, in <module>
    converter.save(save_directory)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/compiler/tensorrt/trt_convert.py", line 713, in save
    importer.import_graph_def(self._converted_graph_def, name="")
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
    producer_op_list=producer_op_list)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/importer.py", line 505, in _import_graph_def_internal
    raise ValueError(str(e))
ValueError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.
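
One thing worth noting in the output above: converter.convert() does not raise. The TensorRTOptimizer failure is only logged (the line tagged E from meta_optimizer.cc), the script goes on to print its own [INFO] message, and the ValueError appears later, inside converter.save(). A minimal sketch for pulling the relevant lines out of a captured log follows. It assumes the console output was saved as plain text; the function name and the marker strings are illustrative choices, not part of TensorFlow.

```python
# Marker substrings copied from the log lines quoted in this post.
MARKERS = (
    "TensorRTOptimizer failed",
    "TensorRTOptimizer is probably called on funcdef",
)


def find_trt_failures(log_text):
    """Return (line_number, line) pairs for log lines containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.strip()))
    return hits


# Abbreviated sample built from the log above (timestamps dropped).
SAMPLE_LOG = """\
I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
E tensorflow/core/grappler/optimizers/meta_optimizer.cc:586] TensorRTOptimizer failed: Invalid argument: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.
W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must NOT be called on function objects.
[INFO] >>> TF Graph converted to TF-TRT Graph. Proceeding to save in specified directory !
"""

for number, line in find_trt_failures(SAMPLE_LOG):
    # Print the log line number and the start of each matching line.
    print(number, line[:80])
```

Checking the captured log this way before calling converter.save() makes it easier to see that the conversion step is where things first go wrong, even though the exception is raised later.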

I would really appreciate any help with this!

Also, when I run:

$ saved_model_cli show --all --tag_set serve --dir ~/gideon/tftrt_model/mnist_completemodel

I get this output:


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['conv2d_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: serving_default_conv2d_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1781: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

Defined Functions:
  Function Name: '__call__'
    Option #1
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #2
      Callable with:
        Argument #1
          conv2d_input: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='conv2d_input')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #3
      Callable with:
        Argument #1
          conv2d_input: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='conv2d_input')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
    Option #4
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None

  Function Name: '_default_save_signature'
    Option #1
      Callable with:
        Argument #1
          conv2d_input: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='conv2d_input')

  Function Name: 'call_and_return_all_conditional_losses'
    Option #1
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
    Option #2
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #3
      Callable with:
        Argument #1
          conv2d_input: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='conv2d_input')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #4
      Callable with:
        Argument #1
          conv2d_input: TensorSpec(shape=(?, 28, 28, 1), dtype=tf.float32, name='conv2d_input')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
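
A possible reading of the listing above (an assumption, not something confirmed in this thread): the serving_default output tensor is named StatefulPartitionedCall:0, which is what a function-based SavedModel export typically looks like, and it matches the "TensorRTOptimizer is probably called on funcdef" warning in the conversion log. The sketch below checks for that pattern in text captured from saved_model_cli show --all. The helper names are hypothetical, and the parsing relies only on the line prefixes visible in the output above.

```python
import re


def signature_output_names(cli_text, signature="serving_default"):
    """Return the tensor names listed under outputs[...] of one signature_def."""
    names = []
    in_signature = False
    in_outputs = False
    for raw in cli_text.splitlines():
        line = raw.strip()
        if line.startswith("signature_def["):
            # A new signature block starts; track whether it is the one we want.
            in_signature = signature in line
            in_outputs = False
        elif line.startswith("Defined Functions"):
            in_signature = False
        elif in_signature and line.startswith("outputs["):
            in_outputs = True
        elif in_signature and line.startswith("inputs["):
            in_outputs = False
        elif in_signature and in_outputs:
            match = re.match(r"name:\s*(\S+)", line)
            if match:
                names.append(match.group(1))
    return names


def looks_function_based(cli_text):
    """Heuristic only: True if an output is produced by a (Stateful)PartitionedCall."""
    prefixes = ("StatefulPartitionedCall", "PartitionedCall")
    return any(name.startswith(prefixes) for name in signature_output_names(cli_text))


# Sample copied from the serving_default block above.
SAMPLE = """\
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['conv2d_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: serving_default_conv2d_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
"""

print(signature_output_names(SAMPLE))
print(looks_function_based(SAMPLE))
```

If the check returns True, it may be worth asking on the TF side whether this SavedModel layout is supported by the TrtGraphConverter workflow used here. This is a hint to investigate, not a confirmed diagnosis.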

Hi,

This line in the TF output is concerning:

2020-02-25 14:02:07.898352: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0

However, from searching for that message, it seems to be harmless in some cases. You may want to ask about it at https://github.com/tensorflow/tensorflow/issues.


Regarding this error:

ValueError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.

This error can apparently occur for a number of reasons.

One possible solution, described here, is to call tf.keras.backend.set_learning_phase(0) before loading the model: https://devtalk.nvidia.com/default/topic/1050006/tensorrt/incompatible-with-expected-resource/post/5356209/#5356209

Also, out of curiosity, are you doing both the training/saving and the TF-TRT conversion on the same device and in the same environment? Training and saving a model in one environment and loading it in another (with different TF / TRT versions) frequently causes issues.
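
To make the environment question concrete, here is a minimal sketch of comparing the package versions recorded when a model was saved against those present at conversion time. The function name and the dictionary layout are hypothetical. The first set of versions is the one reported at the top of this thread, and the second environment is invented purely for illustration.

```python
def diff_versions(saved, current):
    """Return {package: (saved_version, current_version)} for every mismatch."""
    keys = sorted(set(saved) | set(current))
    return {
        key: (saved.get(key), current.get(key))
        for key in keys
        if saved.get(key) != current.get(key)
    }


# Versions reported at the top of this thread.
TRAIN_ENV = {
    "tensorflow-gpu": "1.15.0+nv20.1.tf1",
    "keras": "2.3.1",
    "tensorrt": "6.0.1.10",
}

# Hypothetical second environment; the TensorRT version string is a placeholder.
OTHER_ENV = dict(TRAIN_ENV, tensorrt="0.0.0-hypothetical")

print(diff_versions(TRAIN_ENV, TRAIN_ENV))
print(diff_versions(TRAIN_ENV, OTHER_ENV))
```

An empty result means the two environments agree on every listed package; any entry in the result is a candidate cause to rule out first.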

Replying to: "Also, out of curiosity, are you doing both the training/saving and the TF-TRT conversion on the same device / same environment?"
==> Yes, absolutely. I am training on the Jetson Nano itself, in Keras with TensorFlow as the backend, and then running the TF-TRT conversion on the same device.

About the fix suggested above (https://devtalk.nvidia.com/default/topic/1050006/tensorrt/incompatible-with-expected-resource/post/5356209/#5356209):

I've already tried that, as shown below, but it makes no difference:


tf.keras.backend.set_learning_phase(0)

# Creating a Sequential Model and adding the layers

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(28, kernel_size=(3,3), input_shape=[28,28, 1]))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
#model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10,activation=tf.nn.softmax))

# Compiling Model

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])


Is this the correct place to put it?

Hi,

Yes, that looks right to me based on the issue linked above. If that doesn't work for you, then I think you should try reaching out on the TF/Keras forums, as this issue seems more specific to Keras than to TensorRT.